Python: Rapid Reverse Readline
Written by Eric Schwimmer
There are many solutions out there with which to beat this particular ex-equus, but none of them were quite performant enough for me. The particular software I was working on was reading multi-terabyte files, starting from the end, so loading the entire file into memory was obviously a non-starter. And given the ginormity of the files I was parsing, the code had to be as tight as possible. Here is what I came up with:
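# revReadline: Take a filehandle opened in binary read mode ("rb")
# and return a generator that will read the file backwards,
# line by line. Lines are yielded as bytes, without the newline.
def revReadline(fh, bufSize=4096):
    fh.seek(0, 2)  # whence=2: jump to the end of the file
    filePos = fh.tell()
    fragment = b""
    while filePos > 0:
        readBuffer = fullBuffer = b""
        # Keep reading chunks backwards until one contains a newline
        while b"\n" not in readBuffer and filePos > 0:
            # Clamp the read so we never seek before the start of the file
            readSize = min(bufSize, filePos)
            filePos -= readSize
            fh.seek(filePos)
            readBuffer = fh.read(readSize)
            # Earlier chunks go in front of what has been read so far
            fullBuffer = readBuffer + fullBuffer
        lines = fullBuffer.split(b"\n")
        # Rejoin the partial line left over from the previous pass
        lines[-1] += fragment
        while len(lines) > 1:
            yield lines.pop()
        # lines[0] may start in an earlier chunk, so hold it back
        fragment = lines[0]
    yield fragment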
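Chunk boundaries are the easy place for a reader like this to go wrong, so it can help to compare its output with the obvious in-memory approach on small inputs. What follows is a minimal sketch, not part of the original code: `naiveReverse` and `checkReverseReader` are hypothetical names, and the check assumes the reader takes a binary file-like object plus a buffer size and yields lines as bytes, last line first.

```python
import io

def naiveReverse(fh, bufSize=4096):
    # Reference reader: loads everything into memory, so it is only
    # suitable for small test inputs. bufSize is accepted but unused.
    return iter(fh.read().split(b"\n")[::-1])

def checkReverseReader(reader, data, bufSizes=(1, 2, 3, 4, 7, 4096)):
    # Compare the reader's output with the naive result for several
    # buffer sizes, so chunk boundaries land at different offsets.
    expected = data.split(b"\n")[::-1]
    for bufSize in bufSizes:
        got = list(reader(io.BytesIO(data), bufSize))
        if got != expected:
            return False
    return True

print(checkReverseReader(naiveReverse, b"first\nsecond\nthird"))
```

Passing another reader in place of `naiveReverse`, such as the generator above, runs it through the same comparison. Very small buffer sizes are deliberate: they force lines to straddle several chunks, which is the case most likely to break.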