Kevinello

Posted on 2020-11-27 | Technical articles

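The usual steps for reading a large file

    import pandas as pd

    f = open('./data/ows-raw.txt', encoding='utf-8')
    # error_bad_lines=False skips malformed lines. From pandas 1.3 on,
    # this option is replaced by on_bad_lines='skip'.
    reader = pd.read_table(f, sep=',', iterator=True, error_bad_lines=False)

    loop = True
    chunkSize = 100000  # number of rows to read per chunk
    chunks = []
    while loop:
        try:
            chunk = reader.get_chunk(chunkSize)
            chunks.append(chunk)
        except StopIteration:
            # get_chunk raises StopIteration once the file is exhausted
            loop = False
            print("Iteration is stopped.")
    f.close()

    # Combine all chunks into a single DataFrame
    df = pd.concat(chunks, ignore_index=True)

STORY

Over the past few days I had a requirement to ...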