How to parse a simple config file
This article explains how to parse a config file of the form name=value, similar to Windows .ini files. The code removes all whitespace from each line and skips empty lines and comment lines, i.e. lines that begin with #.
Type of input
The code explained in this article can parse both formatted and unformatted input:

# formatted
generates_output=true
file_format=txt

# unformatted
generates_output = false
file_format=doc
Short implementation summary
The code below parses the config file. After we successfully open the file, we read it line by line. We remove all whitespace from the line and skip the line if it is empty or is a comment (i.e. it starts with "#"). After that, we split the string "name=value" at the delimiter "=" and print the name and the value.
#include <algorithm>
#include <cctype>
#include <fstream>
#include <iostream>
#include <string>

int main() {
    // std::ifstream is RAII, i.e. no need to call close
    std::ifstream cFile("config2.txt");
    if (cFile.is_open()) {
        std::string line;
        while (getline(cFile, line)) {
            // remove every whitespace character from the line
            line.erase(std::remove_if(line.begin(), line.end(), isspace), line.end());
            // skip empty lines and comment lines (check empty() before reading line[0])
            if (line.empty() || line[0] == '#')
                continue;
            // split "name=value" at the first '='
            auto delimiterPos = line.find("=");
            auto name = line.substr(0, delimiterPos);
            auto value = line.substr(delimiterPos + 1);
            std::cout << name << " " << value << '\n';
        }
    } else {
        std::cerr << "Couldn't open config file for reading.\n";
    }
}
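Warning: If you add using namespace std at the top of your source code, you have to use ::isspace (from the global namespace) rather than the unqualified isspace. With the using directive, the unqualified name also refers to the std::isspace overload declared in <locale>, and the compiler cannot decide which function to pass to remove_if().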
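One way to sidestep the overload problem altogether is to wrap the call in a lambda instead of passing the function name. The sketch below is only an illustration: the helper name stripWhitespace is hypothetical and does not appear in the original code. It also casts each character to unsigned char, because passing a negative char value to isspace is undefined behavior.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper, not part of the original article.
// Returns a copy of `line` with every whitespace character removed.
std::string stripWhitespace(std::string line) {
    line.erase(std::remove_if(line.begin(), line.end(),
                              // The lambda calls one specific overload, so this
                              // compiles with or without `using namespace std`.
                              [](unsigned char c) { return std::isspace(c) != 0; }),
               line.end());
    return line;
}
```

With this helper, stripWhitespace(" generates_output = false ") yields generates_output=false.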
Implementation detail 1 - Removing whitespaces
Before doing any further processing, we remove all whitespace from the line. This is accomplished with the help of three functions: erase(), remove_if() and isspace.
line.erase(std::remove_if(line.begin(), line.end(), isspace), line.end());
remove_if()
The function remove_if() takes a sequence (i.e. the line) and rearranges it so that the characters to keep come first, in their original order, without the undesired characters (i.e. the whitespace). The length of the sequence does not get altered; the elements that remain after the new logical end are left in a valid but unspecified state. The function returns an iterator to the new end of the sequence. This is illustrated below. remove_if() takes three arguments. The first two arguments are forward iterators to the initial and final positions in the sequence. The last argument is a unary predicate, i.e. a function pointer or a function object (in our case a pointer to the function isspace).
line.erase(std::remove_if(line.begin(), line.end(), isspace), line.end());
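To see that remove_if() on its own does not shrink the string, the following sketch runs it without the surrounding erase(). The function name compactWhitespace is made up for this illustration, and a lambda is used as the predicate in place of the bare isspace from the article.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: runs remove_if only (no erase) and reports how many
// characters survive, i.e. the distance from begin() to the returned iterator.
std::size_t compactWhitespace(std::string& text) {
    auto newEnd = std::remove_if(text.begin(), text.end(),
                                 [](unsigned char c) { return std::isspace(c) != 0; });
    // text.size() is unchanged here; only [begin(), newEnd) holds meaningful data.
    return static_cast<std::size_t>(newEnd - text.begin());
}
```

For the input "a = b" the function returns 3 while the string still has size 5; its first three characters are a=b and the remaining two are leftovers that erase() would normally cut off.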
isspace
The function isspace checks whether an individual character is a whitespace character. remove_if() calls it once for each element of the sequence. It returns an int that is non-zero for a whitespace character and zero otherwise, so the result is convertible to bool, i.e. true or false.
line.erase(std::remove_if(line.begin(), line.end(), isspace), line.end());
Note: In the "C" locale, whitespace characters include the space (' '), form feed ('\f'), line feed ('\n'), carriage return ('\r'), horizontal tab ('\t'), and vertical tab ('\v'). The backspace character ('\b') is not a whitespace character in the "C" locale. Different locales might define other whitespace characters. (From C++ In a Nutshell: A Desktop Quick Reference by Ray Lischner.)
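The classification in the note can be checked with a small wrapper. This is a sketch under the assumption that the program runs in the default "C" locale; the name isWhitespace is hypothetical. The cast to unsigned char is there because isspace is only defined for values representable as unsigned char (and EOF).

```cpp
#include <cctype>

// Hypothetical helper: wraps std::isspace with the unsigned char cast that
// keeps the call well defined for every possible char value.
bool isWhitespace(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}
```

In the "C" locale, isWhitespace returns true for ' ', '\f', '\n', '\r', '\t' and '\v', and false for '\b' or '='.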
string::erase()
The method std::string::erase() erases the sequence of characters in the half-open range [first, last), i.e. including first but excluding last. In our case, it removes the part between the new end of the sequence returned by std::remove_if and the original end of the sequence. This can be seen in the figure below.
line.erase(std::remove_if(line.begin(), line.end(), isspace), line.end());
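The one-liner can also be written as two named steps, which may make the role of erase() easier to follow. The sketch below is an illustration only; removeWhitespaceInPlace is a hypothetical name, and a lambda replaces the bare isspace predicate.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper that performs the erase-remove idiom in two named steps.
void removeWhitespaceInPlace(std::string& line) {
    // Step 1: shift the characters to keep to the front; newEnd marks where they stop.
    auto newEnd = std::remove_if(line.begin(), line.end(),
                                 [](unsigned char c) { return std::isspace(c) != 0; });
    // Step 2: erase the half-open range [newEnd, line.end()), shrinking the string.
    line.erase(newEnd, line.end());
}
```

Applied to "file_format = doc\r", the string becomes file_format=doc. The trailing carriage return, which can be left over when a file with Windows line endings is read on another platform, is removed as well because it counts as whitespace.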
Implementation detail 2 - Splitting a string at the delimiter
The code below splits the line at the delimiter =. Note that we process the line under the assumption that it contains no whitespace characters, as they have all been removed in the previous step. We first find the position of the delimiter = with std::string::find(). After that, we use the method std::string::substr(pos, len) to extract the name and the value. The method substr(pos, len) returns a substring that starts at position pos and spans len characters (or runs until the end of the string, whichever comes first). Note that in the second call we pass only the first parameter, i.e. the position. The second parameter then takes its default value, which means all characters until the end of the string.
auto delimiterPos = line.find("=");
auto name = line.substr(0, delimiterPos);
auto value = line.substr(delimiterPos + 1);
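The three lines above assume that every line contains a =. If it does not, find() returns std::string::npos, substr(0, npos) returns the whole line, and npos + 1 wraps around to 0, so the value is the whole line as well. A possible guard is sketched below; the function name splitNameValue and its bool return convention are assumptions made for this example, not part of the original code.

```cpp
#include <string>

// Hypothetical helper: splits "name=value" at the first '='.
// Returns false (leaving the outputs untouched) when the line has no delimiter.
bool splitNameValue(const std::string& line, std::string& name, std::string& value) {
    const auto delimiterPos = line.find('=');
    if (delimiterPos == std::string::npos) {
        return false;  // e.g. a stray word without "="
    }
    name = line.substr(0, delimiterPos);
    value = line.substr(delimiterPos + 1);  // empty for a line such as "name="
    return true;
}
```

Because find() reports the first occurrence, a line such as a=b=c is split into the name a and the value b=c.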