Python: Ignoring specific rows in a csv file
I am trying to create a simple line graph to compare columns from 2 files. I have written some code and would like to know how to ignore lines in the 2 .csv files that I have. The code is as follows:
    import numpy as np
    import csv
    from matplotlib import pyplot as plt

    def read_cell(x, y):
        with open('illumina_heart_gencode_paired_end_novel_junctions.csv', 'r') as f:
            reader = csv.reader(f)
            y_count = 0
            for n in reader:
                if y_count == y:
                    cell = n[x]
                    return cell
                y_count += 1

    print(read_cell(6, 932))

    def read_cell(x, y):
        with open('illumina_heart_refseq_paired_end_novel_junctions.csv', 'r') as f:
            reader = csv.reader(f)
            y_count = 0
            for n in reader:
                if y_count == y:
                    cell = n[x]
                    return cell
                y_count += 1

    print(read_cell(6, 932))

    # NOTE: set1 and set2 are never defined in the code as posted
    d1 = []
    for i in set1:
        try:
            d1.append(float(i[5]))
        except ValueError:
            continue

    d2 = []
    for i in set2:
        try:
            d2.append(float(i[5]))
        except ValueError:
            continue

    min_len = len(d1)
    if len(d2) < min_len:
        min_len = len(d2)
    d1 = d1[0:min_len]
    d2 = d2[0:min_len]

    plt.plot(d1, d2, 'r*')
    plt.plot(d1, d2, 'b-')
    plt.xlabel('data set 1: pe_nj')
    plt.ylabel('data set 2: pe_sj')
    plt.show()

The first csv file has 932 rows and the second one has 99,154 rows. I am only interested in taking the first 932 rows from both files, and I want to compare the 7th column in both files.
How do I go about doing that?
The first file looks like this:
    chr1 1718493 1718764 2 2 0 12 0 24
    chr1 8928117 8930883 2 2 0 56 0 24
    chr1 8930943 8931949 2 2 0 48 0 25
    chr1 9616316 9627341 1 1 0 12 0 24
    chr1 10166642 10167279 1 1 0 31 1 24

The second file looks like so:
    chr1 880181 880421 2 2 0 15 0 21
    chr1 1718493 1718764 2 2 0 12 0 24
    chr1 8568735 8585817 2 2 0 12 0 21
    chr1 8617583 8684368 2 2 0 14 0 23
    chr1 8928117 8930883 2 2 0 56 0 24
One possible approach is to read all the lines from the first (shorter) file, find out its length (N), read N lines from the second file, and then take the k-th column you are interested in from both files.
Something like this (adjusting the delimiter for your case):
    import csv

    def read_tsv_file(fname):
        # Reads the full contents of a tab-separated file (like the ones you have).
        # Python 3: open in text mode with newline='' ('rb' only applies to Python 2's csv module).
        with open(fname, newline='') as f:
            return list(csv.reader(f, delimiter='\t'))

    def take_nth_column(first_array, second_array, n):
        # Returns a tuple containing the nth columns (0-based) from both arrays,
        # with length corresponding to the length of the smaller array.
        min_len = min(len(first_array), len(second_array))
        col1 = [row[n] for row in first_array[:min_len]]
        col2 = [row[n] for row in second_array[:min_len]]
        return (col1, col2)

    first_array = read_tsv_file('your-first-file')
    second_array = read_tsv_file('your-second-file')
    # Index 6 is the 7th column, because Python indices start at 0.
    (col1, col2) = take_nth_column(first_array, second_array, 6)
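If loading all 99,154 rows of the longer file just to throw most of them away is a concern, a variation on the same idea is to walk both files in lockstep with `zip`, which stops as soon as the shorter file runs out. The following is only a minimal sketch, assuming Python 3 and tab-separated input. The function name `paired_column` is made up for illustration, and dropping any row pair where either value is missing or non-numeric is an assumption about what you want.

```python
import csv

def paired_column(first_path, second_path, col, delimiter='\t'):
    """Return (value1, value2) float pairs from 0-based column `col`.

    Both files are read row by row in lockstep. zip() stops at the end of
    the shorter file, so the longer file is never read in full.
    """
    pairs = []
    with open(first_path, newline='') as f1, open(second_path, newline='') as f2:
        reader1 = csv.reader(f1, delimiter=delimiter)
        reader2 = csv.reader(f2, delimiter=delimiter)
        for row1, row2 in zip(reader1, reader2):
            try:
                pairs.append((float(row1[col]), float(row2[col])))
            except (ValueError, IndexError):
                # Assumption: skip the whole pair if either cell is
                # missing or not a number (e.g. a header row).
                continue
    return pairs

# Hypothetical usage (file names are placeholders):
# pairs = paired_column('your-first-file', 'your-second-file', 6)
# d1 = [a for a, b in pairs]
# d2 = [b for a, b in pairs]
```

Because each pair is kept or dropped as a unit, `d1` and `d2` always end up the same length, so they can be passed straight to `plt.plot`.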