HW1 | Python Fiddle

'''

• id: the id of a training set question pair
• qid1, qid2: unique ids of each question (only available in train.csv)
• question1, question2: the full text of each question
• is_duplicate: the target variable, set to 1 if question1 and question2 have essentially the same meaning, and
0 otherwise

Report the following:
1. Overlapping scores for the first 10 lines
2. Maximum overlapping score, minimum overlapping score, median overlapping score
3. Briefly discuss your findings.

Questions:
How to select the correct column?
How to distinguish between sentence columns?
'''

import csv
import os
import sys

'''
TODO: change first_words to split on commas instead of spaces, so that it
returns the text between commas 3 and 4 (question1) and between commas 4 and 5
(question2).

# Test our method.
test = "Stately, plump Buck Mulligan came from the stairhead."
result = first_words(test, 4)  # cuts at the 4th space -> switch to commas
print(result)
=> Stately, plump Buck Mulligan

'''

def first_words(text, words):
    # Return the part of text before the words-th space,
    # or "" if text has fewer than that many spaces.
    for i, char in enumerate(text):
        # Count spaces in the string.
        if char == ' ':
            words -= 1
        if words == 0:
            # Return the slice up to this point.
            return text[0:i]
    return ""
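
# A possible comma-based variant of first_words, sketched under the assumption
# that each line holds the columns id, qid1, qid2, question1, question2,
# is_duplicate in that order. Counting raw commas breaks as soon as a question
# contains a comma itself, so this sketch lets the csv module parse the line.
# The name question_fields is hypothetical and not part of the assignment.
import csv


def question_fields(line):
    # Parse a single CSV line, honouring double-quoted fields.
    row = next(csv.reader([line]))
    # Fields 3 and 4 sit between commas 3,4 and 4,5: question1 and question2.
    return row[3], row[4]


# Example:
# question_fields('0,1,2,"What is AI, exactly?",What is AI?,0')
# => ('What is AI, exactly?', 'What is AI?')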

# Read the training data as comma-separated values and print each row.
with open('training.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        print(', '.join(row))
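
# A minimal sketch of the report steps listed at the top. ASSUMPTION: the
# assignment text does not define the "overlapping score", so it is taken here
# to be the Jaccard overlap of the two questions' word sets (shared words
# divided by all distinct words). The names overlap_score and summarize are
# hypothetical; swap in the course's own definition if it differs.
import statistics


def overlap_score(question1, question2):
    # Lowercase and split on whitespace; punctuation is not stripped.
    words1 = set(question1.lower().split())
    words2 = set(question2.lower().split())
    if not words1 and not words2:
        # Two empty questions share nothing to compare.
        return 0.0
    return len(words1 & words2) / len(words1 | words2)


def summarize(scores):
    # Maximum, minimum and median of a non-empty list of scores.
    return max(scores), min(scores), statistics.median(scores)


# Example:
# overlap_score("how do I learn python", "how to learn python")
# => 0.5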
