Friday, September 11, 2009

C++: if file line length exceeds array (buffer) length

Suppose we have implemented the following scenario:


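// read input file line by line
// allocate 256 characters for each line

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    ifstream input_file("some_file.txt");
    const int BUF_SIZE = 256;
    char buf[BUF_SIZE];
    string s, strCurString;

    if (!input_file.is_open())
    {
        cerr << "File some_file.txt could not be opened!" << endl;
        cin.get(); // wait for Enter (portable stand-in for getch())
        exit(EXIT_FAILURE);
    }

    // NOTE: this loop never terminates if a line is longer
    // than BUF_SIZE - 1 characters
    while (!input_file.eof()) {
        input_file.getline(buf, BUF_SIZE);
        strCurString = buf;
        s += strCurString;
    }

    cout << "File contents: " << endl << s << endl;
    return 0;
}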


But what if the length of the current line exceeds BUF_SIZE (more precisely, BUF_SIZE - 1, since one element is reserved for the terminating null character)? In this case the while loop never ends. Why? Because getline() sets a special bit (failbit) in the input file stream object, indicating that the last getline() operation failed (here not because the end of the file was reached, but because the buffer filled up before a newline was found). While failbit is set, all subsequent calls to getline() fail without reading anything, so the end of the file is never reached and eof() never returns true. You can see this by calling input_file.gcount(), which keeps returning 0 (zero) after the getline() call that set failbit.
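
To see this stream state for ourselves, here is a minimal sketch (my own illustration, not part of the program above). It assumes a std::istringstream in place of a file and a deliberately tiny 8-character buffer; the names GetlineReport and probe_overlong_line are made up for this example.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper type: what the stream reports after each of two
// consecutive getline() calls into a buffer that is too small.
struct GetlineReport {
    bool fail_after_first;
    std::streamsize count_after_first;
    bool fail_after_second;
    std::streamsize count_after_second;
    bool eof_after_second;
};

GetlineReport probe_overlong_line(std::istream& in) {
    assert(in.good());           // expects a fresh, readable stream

    const int BUF_SIZE = 8;      // tiny on purpose: room for 7 characters + '\0'
    char buf[BUF_SIZE];
    GetlineReport r{};

    // First call: if the line has more than 7 characters, the buffer fills
    // up before a newline is found and failbit is set.
    in.getline(buf, BUF_SIZE);
    r.fail_after_first = in.fail();
    r.count_after_first = in.gcount();

    // Second call: failbit is still set, so nothing is extracted at all.
    in.getline(buf, BUF_SIZE);
    r.fail_after_second = in.fail();
    r.count_after_second = in.gcount();
    r.eof_after_second = in.eof();
    return r;
}
```

Feeding it a stream such as std::istringstream in("0123456789abcdef\nshort\n") should report failbit after both calls, a gcount() of 7 after the first call and 0 after the second, and eof() still false. That last part is why a loop conditioned only on eof() cannot finish.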


To overcome this, we can use a trick found here:


// read input file line by line
// allocate 256 characters for each line

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    ifstream input_file("some_file.txt");
    const int BUF_SIZE = 256;
    char buf[BUF_SIZE];
    string s, strCurString;

    if (!input_file.is_open())
    {
        cerr << "File some_file.txt could not be opened!" << endl;
        cin.get(); // wait for Enter (portable stand-in for getch())
        exit(EXIT_FAILURE);
    }

    while (!input_file.eof()) {
        input_file.getline(buf, BUF_SIZE);

        // failbit is set when the current line has more than
        // BUF_SIZE - 1 characters; buf then holds the first
        // part of that line
        if (input_file.fail() && !input_file.eof()) {
            // clear the failbit so that the next getline()
            // call continues with the rest of the line
            input_file.clear();
        }
        strCurString = buf;
        s += strCurString;
    }

    cout << "File contents: " << endl << s << endl;
    return 0;
}
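
A different technique is also worth mentioning; it is an alternative to the clear() trick, not a variant of it. The free function std::getline() from <string> reads into a std::string that grows as needed, so there is no fixed buffer for a line to overflow. Below is a minimal sketch, assuming the goal is the same as above (concatenate all lines, dropping the newlines); read_all_lines and self_check are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: concatenates all lines of a stream, dropping the
// newline characters, as the loops above do.
std::string read_all_lines(std::istream& in) {
    std::string s, line;
    // std::getline() resizes 'line' as needed; the loop stops at the
    // first read that fails (normally at end of input).
    while (std::getline(in, line)) {
        s += line;
    }
    return s;
}

// Small self-check with a 300-character line, i.e. longer than the
// 256-character buffer used above.
void self_check() {
    std::istringstream in(std::string(300, 'x') + "\nshort\n");
    assert(read_all_lines(in) == std::string(300, 'x') + "short");
}
```

With a real file, the ifstream can be passed directly, for example read_all_lines(input_file), because std::ifstream is a std::istream. Testing the result of the read itself, rather than eof() before the read, also means the loop body never runs on a failed read.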
