#include <jsonreader.h>
Public Member Functions | |
wxJSONReader (int flags=wxJSONREADER_TOLERANT, int maxErrors=30) | |
Ctor. | |
virtual | ~wxJSONReader () |
Dtor - does nothing. | |
int | Parse (const wxString &doc, wxJSONValue *val) |
Parse the JSON document. | |
int | Parse (wxInputStream &doc, wxJSONValue *val) |
int | GetDepth () const |
Return the depth of the JSON input text. | |
int | GetErrorCount () const |
Return the size of the error message's array. | |
int | GetWarningCount () const |
Return the size of the warning message's array. | |
const wxArrayString & | GetErrors () const |
Return a reference to the error message's array. | |
const wxArrayString & | GetWarnings () const |
Return a reference to the warning message's array. | |
Static Public Member Functions | |
static int | UTF8NumBytes (char ch) |
Compute the number of bytes that make up a UTF-8 encoded wide character. | |
static bool | Strtoll (const wxString &str, wxInt64 *i64) |
Converts a decimal string to a 64-bit signed integer. | |
static bool | Strtoull (const wxString &str, wxUint64 *ui64) |
Converts a decimal string to a 64-bit unsigned integer. | |
static bool | DoStrto_ll (const wxString &str, wxUint64 *ui64, wxChar *sign) |
Perform the actual conversion from a string to a 64-bit integer. | |
Protected Member Functions | |
int | DoRead (wxInputStream &doc, wxJSONValue &val) |
Reads the JSON text document (internal use). | |
void | AddError (const wxString &descr) |
Add an error message to the error's array. | |
void | AddError (const wxString &fmt, const wxString &str) |
void | AddError (const wxString &fmt, wxChar ch) |
void | AddWarning (int type, const wxString &descr) |
Add a warning message to the warning's array. | |
int | GetStart (wxInputStream &is) |
Returns the start of the document. | |
int | ReadChar (wxInputStream &is) |
Read a character from the input JSON document. | |
int | PeekChar (wxInputStream &is) |
Peek a character from the input JSON document. | |
void | StoreValue (int ch, const wxString &key, wxJSONValue &value, wxJSONValue &parent) |
Store a value in the parent object. | |
int | SkipWhiteSpace (wxInputStream &is) |
Skip all whitespaces. | |
int | SkipComment (wxInputStream &is) |
Skip a comment. | |
void | StoreComment (const wxJSONValue *parent) |
Store the comment string in the value it refers to. | |
int | ReadString (wxInputStream &is, wxJSONValue &val) |
Read a string value. | |
int | ReadToken (wxInputStream &is, int ch, wxString &s) |
Reads a token string. | |
int | ReadValue (wxInputStream &is, int ch, wxJSONValue &val) |
Read a value from input stream. | |
int | ReadUES (wxInputStream &is, char *uesBuffer) |
Read a 4-hex-digit unicode character. | |
int | AppendUES (wxMemoryBuffer &utf8Buff, const char *uesBuffer) |
The function appends a Unicode Escaped Sequence to the temporary UTF8 buffer. | |
int | NumBytes (char ch) |
Return the number of bytes that make a character in stream input. | |
int | ConvertCharByChar (wxString &s, const wxMemoryBuffer &utf8Buffer) |
Convert a UTF-8 memory buffer one char at a time. | |
Protected Attributes | |
int | m_flags |
Flags that control the parser behaviour. | |
int | m_maxErrors |
Maximum number of errors stored in the error's array. | |
int | m_lineNo |
The current line number (start at 1). | |
int | m_colNo |
The current column number (start at 1). | |
int | m_level |
The current level of object/array nesting (starts at ZERO). | |
int | m_depth |
The depth level of the read JSON text. | |
wxJSONValue * | m_current |
The pointer to the value object that is being read. | |
wxJSONValue * | m_lastStored |
The pointer to the value object that was last stored. | |
wxJSONValue * | m_next |
The pointer to the value object that will be read. | |
wxString | m_comment |
The comment string read by SkipComment(). | |
int | m_commentLine |
The starting line of the comment string. | |
wxArrayString | m_errors |
The array of error messages. | |
wxArrayString | m_warnings |
The array of warning messages. | |
int | m_peekChar |
The character read by the PeekChar() function (-1 none). | |
bool | m_noUtf8 |
ANSI: do not convert UTF-8 strings. |
The class is a JSON parser which reads JSON formatted text and stores the values in a wxJSONValue structure. The ctor accepts two parameters: the style flag, which controls how error-tolerant the parser is, and an integer which is the maximum number of errors and warnings that are reported (the default is 30).
If the JSON text document does not contain an open/close JSON character the function returns an invalid value object; in other words, the wxJSONValue::IsValid() function returns FALSE. This is the case for a document that is empty or contains only whitespace or comments. If the document contains a starting object/array character immediately followed by a closing object/array character (e.g. {} ) then the function returns an empty array or object JSON value. This is a valid JSON object of type wxJSONTYPE_OBJECT or wxJSONTYPE_ARRAY whose wxJSONValue::Size() function returns ZERO.
Note that the parsing process stops when the internal DoRead() function returns. Because that function is recursive, the top-level close-object '}' or close-array ']' character causes the top-level DoRead() function to return, thus stopping the parsing process regardless of the EOF condition. This means that the JSON input text may contain anything after the top-level close-object/array character. Here are some examples:
Returns a wxJSONTYPE_INVALID value (invalid JSON value)
// this text does not contain an open array/object character
Returns a wxJSONTYPE_OBJECT value of Size() = 0
{ }
Returns a wxJSONTYPE_ARRAY value of Size() = 0
[ ]
Text before and after the top-level open/close characters is ignored.
This non-JSON text does not cause the parser to report errors or warnings { } This non-JSON text does not cause the parser to report errors or warnings
In standard JSON the literals null, true and false must be lowercase; the wxJSON parser also recognizes mixed-case literals such as Null or FaLSe, and emits a warning when it does.
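The case-tolerant matching of literals can be illustrated with a small standard C++ sketch. It is not the wxJSON implementation: the names Literal, LiteralMatch, toLowerAscii and matchLiteral are hypothetical, and the sketch only reports whether a warning would be due instead of storing one.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical result type: which literal matched and whether a warning is due.
enum class Literal { None, Null, True, False };

struct LiteralMatch {
    Literal kind;   // Literal::None if the token is not a literal
    bool    warn;   // true if the token matched but was not all lowercase
};

// Lowercase an ASCII token; JSON literals are plain US-ASCII.
static std::string toLowerAscii(const std::string& s) {
    std::string out(s);
    for (char& c : out) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return out;
}

// Compare the lowercased token against the three literals and flag
// any token that only matched because of the case folding.
LiteralMatch matchLiteral(const std::string& token) {
    const std::string lower = toLowerAscii(token);
    Literal kind = Literal::None;
    if (lower == "null")       kind = Literal::Null;
    else if (lower == "true")  kind = Literal::True;
    else if (lower == "false") kind = Literal::False;
    const bool warn = (kind != Literal::None) && (lower != token);
    return { kind, warn };
}
```

With this sketch, matchLiteral("FaLSe") yields Literal::False with warn set, while matchLiteral("false") yields the same literal without a warning.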
When the input is from a stream object, the only recognized encoding format is UTF-8 for both ANSI and Unicode builds.
    wxJSONValue  value;
    wxJSONReader reader;

    // open a text file that contains the UTF-8 encoded JSON text
    wxFFileInputStream jsonText( _T("filename.utf8"), _T("r"));

    // read the file
    int numErrors = reader.Parse( jsonText, &value );
    if ( numErrors > 0 )  {
      wxMessageBox( _T("Error reading the input file"));
    }
Starting from version 1.1.0 the internal organization of the wxJSON reader and writer has changed. To know more about ANSI and Unicode mode, read Unicode support in wxJSON.
wxJSONReader::wxJSONReader (int flags = wxJSONREADER_TOLERANT, int maxErrors = 30)
Ctor.
Construct a JSON parser object with the given parameters.
JSON parser objects should always be constructed on the stack but it does not hurt to have a global JSON parser.
flags | this parameter controls how error-tolerant the parser is
maxErrors | the maximum number of errors (and warnings, too) that are reported by the parser. When the number of errors reaches this limit, the parser stops reading the JSON input text and no further errors are reported.
The flags parameter is a combination of ZERO or more of the wxJSONREADER_ constants OR'ed together, for example:
    wxJSONReader reader( wxJSONREADER_TOLERANT | wxJSONREADER_STORE_COMMENTS );
    wxJSONValue root;
    int numErrors = reader.Parse( jsonText, &root );
wxJSONReader::~wxJSONReader () [virtual]
Dtor - does nothing.
int wxJSONReader::Parse (const wxString &doc, wxJSONValue *val)
Parse the JSON document.
The two overloaded versions of the Parse() function read a JSON text stored in a wxString object or in a wxInputStream object.
If val is a NULL pointer, the function does not store the values: it can be used as a JSON checker in order to check the syntax of the document. Returns the number of errors found in the document. If the returned value is ZERO and the parser was constructed with the wxJSONREADER_STRICT flag, then the parsed document is well-formed and it only contains valid JSON text.
If the wxJSONREADER_TOLERANT flag was used in the parser's constructor, then a return value of ZERO does not mean that the document is well-formed because it may contain comments and other extensions that are not fatal for the wxJSON parser but that other parsers may fail to recognize. You can use the GetWarningCount() function to know how many wxJSON extensions are present in the JSON input text.
Note that the JSON value object val is not cleared by this function unless it is of the wrong type. In other words, if val is of type wxJSONTYPE_ARRAY and it already contains 10 elements and the input document starts with a '[' (open-array char) then the elements read from the document are appended to the existing ones.
On the other hand, if the text document starts with a '{' (open-object) char then this function must change the type of the val object to wxJSONTYPE_OBJECT and the old content of 10 array elements will be lost.
doc | the JSON text that has to be parsed
val | the wxJSONValue object that contains the parsed text; if NULL the parser does not store anything but errors and warnings are reported
int wxJSONReader::Parse (wxInputStream &doc, wxJSONValue *val)
int wxJSONReader::GetDepth () const
Return the depth of the JSON input text.
The function returns the highest nesting level reached by the recursive DoRead() function during the parsing process, that is, the maximum depth of the JSON input text.
int wxJSONReader::GetErrorCount () const
Return the size of the error message's array.
int wxJSONReader::GetWarningCount () const
Return the size of the warning message's array.
const wxArrayString & wxJSONReader::GetErrors () const
Return a reference to the error message's array.
const wxArrayString & wxJSONReader::GetWarnings () const
Return a reference to the warning message's array.
int wxJSONReader::UTF8NumBytes (char ch) [static]
Compute the number of bytes that make up a UTF-8 encoded wide character.
The function counts the leading '1' bits in the character ch to determine the length of the sequence. The UTF-8 encoding specifies the number of bytes needed by a wide character by coding it in the first byte. See below.
Note that if the character does not contain a valid UTF-8 encoding the function returns -1.
    UCS-4 range (hex.)    UTF-8 octet sequence (binary)
    -------------------   -----------------------------
    0000 0000-0000 007F   0xxxxxxx
    0000 0080-0000 07FF   110xxxxx 10xxxxxx
    0000 0800-0000 FFFF   1110xxxx 10xxxxxx 10xxxxxx
    0001 0000-001F FFFF   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    0020 0000-03FF FFFF   111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
    0400 0000-7FFF FFFF   1111110x 10xxxxxx ... 10xxxxxx
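The lead-byte rule in the table can be sketched in standard C++ as follows. This is an illustration of the technique, not the wxJSON source: utf8SequenceLength is a hypothetical name, and treating a lone continuation byte (10xxxxxx) and the bytes 0xFE/0xFF as invalid is an assumption based on the table above.

```cpp
#include <cassert>

// Returns the total number of bytes in the UTF-8 sequence introduced by the
// lead byte `ch`, or -1 if `ch` cannot start a sequence.
int utf8SequenceLength(char ch) {
    const unsigned char byte = static_cast<unsigned char>(ch);

    // Count the '1' bits starting from the most significant bit and
    // stop at the first '0' bit.
    int ones = 0;
    for (unsigned char mask = 0x80; mask != 0 && (byte & mask) != 0; mask >>= 1) {
        ++ones;
    }

    if (ones == 0) return 1;               // 0xxxxxxx: single byte (US-ASCII)
    if (ones == 1 || ones > 6) return -1;  // continuation byte, or 0xFE / 0xFF
    return ones;                           // 110xxxxx -> 2 ... 1111110x -> 6
}
```

For example, the lead byte 0xE2 (binary 11100010) has three leading '1' bits, so the sequence is three bytes long.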
bool wxJSONReader::Strtoll (const wxString &str, wxInt64 *i64) [static]
Converts a decimal string to a 64-bit signed integer.
This function implements a simple variant of the strtoll C-library function. I needed this implementation because the wxString::To(U)LongLong function does not work on my system:
Note that this implementation is not a complete substitute for the strtoll function because it only converts decimal strings (only base 10 is implemented).
str | the string that contains the decimal literal
i64 | the pointer to the long long which holds the converted value
bool wxJSONReader::Strtoull (const wxString &str, wxUint64 *ui64) [static]
Converts a decimal string to a 64-bit unsigned integer.
Similar to Strtoll but for unsigned integers.
bool wxJSONReader::DoStrto_ll (const wxString &str, wxUint64 *ui64, wxChar *sign) [static]
Perform the actual conversion from a string to a 64-bit integer.
This function is called internally by the Strtoll and Strtoull functions and it does the actual conversion. The function is also able to detect numeric overflow.
str | the string that has to be converted
ui64 | the pointer to an unsigned long long that holds the converted value
sign | the pointer to a wxChar character that will get the sign of the literal string, if any
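The split between a sign-aware magnitude parser and thin signed/unsigned wrappers can be sketched with standard C++ types. This is a minimal illustration under stated assumptions, not the wxJSON code: parseMagnitude and parseInt64 are hypothetical names, std::string stands in for wxString, and reporting "no sign" as a space character is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>

// Parses an optional sign followed by decimal digits into an unsigned
// magnitude. Returns false on missing digits, non-digits or 64-bit overflow.
bool parseMagnitude(const std::string& str, std::uint64_t* magnitude, char* sign) {
    assert(magnitude != nullptr && sign != nullptr);
    std::size_t i = 0;
    *sign = ' ';                                   // assumption: ' ' means "no sign"
    if (!str.empty() && (str[0] == '+' || str[0] == '-')) {
        *sign = str[0];
        i = 1;
    }
    if (i == str.size()) return false;             // no digits at all

    const std::uint64_t max = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t value = 0;
    for (; i < str.size(); ++i) {
        if (str[i] < '0' || str[i] > '9') return false;
        const std::uint64_t digit = static_cast<std::uint64_t>(str[i] - '0');
        if (value > (max - digit) / 10) return false;  // value*10 + digit would overflow
        value = value * 10 + digit;
    }
    *magnitude = value;
    return true;
}

// Signed wrapper: checks the magnitude against the int64 range for the sign.
bool parseInt64(const std::string& str, std::int64_t* out) {
    assert(out != nullptr);
    std::uint64_t mag = 0;
    char sign = ' ';
    if (!parseMagnitude(str, &mag, &sign)) return false;

    const std::uint64_t maxPos =
        static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max());
    if (sign == '-') {
        if (mag > maxPos + 1) return false;        // below the minimum int64
        // The minimum int64 has no positive counterpart, so handle it apart.
        *out = (mag == maxPos + 1) ? std::numeric_limits<std::int64_t>::min()
                                   : -static_cast<std::int64_t>(mag);
    } else {
        if (mag > maxPos) return false;            // above the maximum int64
        *out = static_cast<std::int64_t>(mag);
    }
    return true;
}
```

Keeping the overflow test in the magnitude parser means the signed wrapper only has to compare the result against the range allowed for the given sign.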
int wxJSONReader::DoRead (wxInputStream &is, wxJSONValue &parent) [protected]
Reads the JSON text document (internal use).
This is a recursive function that is called by Parse() and by the DoRead() function itself when a new object / array character is encountered. The function returns when an EOF condition is encountered or when the corresponding close-object / close-array char is encountered. The function also increments the m_level data member when it is entered and decrements it on return. It also sets m_depth equal to m_level if m_depth is less than m_level.
The function is the heart of the wxJSON parser class but it is also very easy to understand because JSON syntax is very easy.
Returns the last close-object/array character read or -1 on EOF.
is | the input stream that contains the JSON text
parent | the JSON value object that is the parent of all subobjects read by the function until the next close-object/array (for the top-level DoRead function parent is the root JSON object)
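The recursion and the level/depth bookkeeping can be sketched without any wxWidgets types. The following is a simplified illustration, not the library's DoRead(): DepthScanner, read(), skipString() and maxDepth() are hypothetical names, the input is a std::string rather than a wxInputStream, and no values are stored.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct DepthScanner {
    const std::string& text;
    std::size_t pos = 0;
    int level = 0;   // current nesting level
    int depth = 0;   // highest level reached so far

    explicit DepthScanner(const std::string& t) : text(t) {}

    // Consumes input until the close character of the current level or EOF.
    // Returns the close character read, or -1 on EOF.
    int read() {
        ++level;
        if (depth < level) depth = level;
        int result = -1;
        while (pos < text.size()) {
            const char ch = text[pos++];
            if (ch == '"') {
                skipString();              // brackets inside strings do not count
            } else if (ch == '{' || ch == '[') {
                read();                    // recurse into the nested object/array
            } else if (ch == '}' || ch == ']') {
                result = ch;
                break;
            }
        }
        --level;
        return result;
    }

    // Skips to the closing double quote, honouring backslash escapes.
    void skipString() {
        while (pos < text.size()) {
            const char ch = text[pos++];
            if (ch == '\\' && pos < text.size()) ++pos;
            else if (ch == '"') return;
        }
    }
};

// Returns the maximum nesting depth, or 0 if there is no open character.
int maxDepth(const std::string& text) {
    DepthScanner s(text);
    while (s.pos < text.size()) {
        const char ch = text[s.pos++];
        if (ch == '{' || ch == '[') {
            s.read();
            break;   // parsing stops when the top-level read() returns
        }
    }
    return s.depth;
}
```

Because the loop in maxDepth() stops as soon as the top-level read() returns, anything after the top-level close character is never looked at, which mirrors the behaviour described for the parser.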
void wxJSONReader::AddError (const wxString &msg) [protected]
Add an error message to the error's array.
The overloaded versions of this function add an error message to the error's array stored in m_errors. The error message is formatted as follows:
Error: line xxx, col xxx - <error_description>
The msg parameter is the description of the error; the line and column numbers are automatically added by the functions. The fmt parameter is a format string that has the same syntax as the printf function. Note that it is the user's responsibility to provide a format string suitable for the arguments: another string or a character.
void wxJSONReader::AddError (const wxString &fmt, const wxString &str) [protected]
void wxJSONReader::AddError (const wxString &fmt, wxChar ch) [protected]
void wxJSONReader::AddWarning (int type, const wxString &msg) [protected]
Add a warning message to the warning's array.
The warning description is as follows:
Warning: line xxx, col xxx - <warning_description>
Warning messages are generated by the parser when the JSON text that has been read is not well-formed but the error is not fatal and the parser recognizes the text as an extension to the JSON standard (see the parser's ctor for more info about wxJSON extensions).
Note that the parser has to be constructed with a flag that indicates whether each individual wxJSON extension is on. If the warning message is related to an extension that is not enabled in the parser's m_flags data member, this function calls AddError() and the warning message becomes an error message. The type parameter is one of the same constants that specify the parser's extensions.
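The rule that a warning becomes an error when its extension is not enabled can be sketched as below. This is an illustration only: the EXT_ constants, the Diagnostics struct and its member names are hypothetical and do not reproduce wxJSON's real flag values, and line and column are passed in explicitly instead of being read from the parser state.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical extension bits, for illustration only.
enum ExtensionFlag {
    EXT_ALLOW_COMMENTS = 1 << 0,
    EXT_CASE_TOLERANT  = 1 << 1
};

struct Diagnostics {
    int flags;                          // enabled extensions, OR'ed together
    std::vector<std::string> errors;
    std::vector<std::string> warnings;

    // Stores a warning if the extension `type` is enabled in `flags`;
    // otherwise the same text is stored as an error.
    void addWarning(int type, int line, int col, const std::string& msg) {
        const std::string where =
            "line " + std::to_string(line) + ", col " + std::to_string(col) + " - ";
        if ((flags & type) != 0) {
            warnings.push_back("Warning: " + where + msg);
        } else {
            errors.push_back("Error: " + where + msg);
        }
    }
};
```

A parser configured with only EXT_ALLOW_COMMENTS would therefore record a comment as a warning, but a mixed-case literal as an error.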
int wxJSONReader::GetStart (wxInputStream &is) [protected]
Returns the start of the document.
This is the first function called by the Parse() function and it searches the input stream for the starting character of a JSON text and returns it. JSON text starts with '{' or '['. If the two starting characters are inside a C/C++ comment, they are ignored. Returns the JSON-text start character or -1 on EOF.
is | the input stream that contains the JSON text
int wxJSONReader::ReadChar (wxInputStream &is) [protected]
Read a character from the input JSON document.
The function returns the next byte from the UTF-8 stream as an INT. In case of errors or EOF, the function returns -1. The function also updates the m_lineNo and m_colNo data members and converts every CR+LF sequence to LF.
This function only returns one UTF-8 byte (one code unit) at a time and not Unicode code points. The only reason for this function is to keep track of line and column numbers.
is | the input stream that contains the JSON text
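Byte-wise reading with CR+LF folding and line/column tracking can be sketched on top of std::istream. This is not the wxJSON implementation: CharReader and its members are hypothetical names, and the exact moment at which the column counter advances is an assumption.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

class CharReader {
public:
    explicit CharReader(std::istream& in) : m_in(in) {}

    // Returns the next byte as an int, or -1 on EOF or error.
    // A CR+LF pair is returned as a single LF.
    int readChar() {
        int ch = m_in.get();
        if (ch == std::char_traits<char>::eof()) return -1;
        if (ch == '\r' && m_in.peek() == '\n') {
            ch = m_in.get();            // drop the CR, keep the LF
        }
        if (ch == '\n') {               // a new line starts at column 1
            ++m_lineNo;
            m_colNo = 1;
        } else {
            ++m_colNo;
        }
        return ch;
    }

    int line() const { return m_lineNo; }
    int col()  const { return m_colNo; }

private:
    std::istream& m_in;
    int m_lineNo = 1;   // current line number, starts at 1
    int m_colNo  = 1;   // current column number, starts at 1
};
```

Since the reader works on single bytes, a multi-byte UTF-8 character advances the column counter once per byte in this sketch.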
int wxJSONReader::PeekChar (wxInputStream &is) [protected]
Peek a character from the input JSON document.
This function just calls the Peek() function on the stream and returns its result.
is | the input stream that contains the JSON text
void wxJSONReader::StoreValue (int ch, const wxString &key, wxJSONValue &value, wxJSONValue &parent) [protected]
Store a value in the parent object.
The function is called by DoRead() when a comma or a close-object/array character is encountered and stores the current value read by the parser in the parent object. The function checks that value is not invalid and that key is not an empty string if parent is an object.
ch | the character read: a comma or a close-object/array char
key | the key string: must be empty if parent is an array
value | the current JSON value to be stored in parent
parent | the JSON value that is the parent of value.
int wxJSONReader::SkipWhiteSpace (wxInputStream &is) [protected]
Skip all whitespace.
The function reads characters from the input text and returns the first non-whitespace character read or -1 on EOF. Note that the function does not rely on the isspace function of the C library but checks for the whitespace constants: space, TAB and LF.
int wxJSONReader::SkipComment (wxInputStream &is) [protected]
Skip a comment.
The function is called by DoRead() when a '/' (slash) character is read from the input stream assuming that a C/C++ comment is starting. Returns the first character that follows the comment or -1 on EOF. The function also adds a warning message because comments are not valid JSON text. The function also stores the comment, if any, in the m_comment data member: it can be used by the DoRead() function if comments have to be stored in the value they refer to.
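Skipping a C or C++ comment while keeping its text can be sketched as follows. This is a simplified illustration, not the library code: skipComment is a hypothetical free function, the input is a std::string with an explicit position instead of a stream, no warning is recorded, and ending a C++ comment just before the LF is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// `pos` must index the character right after the initial '/'.
// On return `comment` holds the comment text including its delimiters and
// `pos` is just past the returned character.
// Returns the first character after the comment, or -1 on EOF.
int skipComment(const std::string& text, std::size_t& pos, std::string& comment) {
    comment.clear();
    if (pos >= text.size()) return -1;

    const char kind = text[pos++];
    if (kind != '/' && kind != '*') {
        return static_cast<unsigned char>(kind);   // not a comment after all
    }

    comment = "/";
    comment += kind;
    if (kind == '/') {
        // C++ comment: runs up to, but not including, the end of line.
        while (pos < text.size() && text[pos] != '\n') comment += text[pos++];
    } else {
        // C comment: runs up to and including the closing "*/".
        while (pos < text.size()) {
            comment += text[pos++];
            const std::size_t n = comment.size();
            // n >= 4 keeps the opening "/*" from being taken as part of "*/".
            if (n >= 4 && comment[n - 2] == '*' && comment[n - 1] == '/') break;
        }
    }
    if (pos >= text.size()) return -1;
    return static_cast<unsigned char>(text[pos++]);
}
```

An unterminated C comment consumes the rest of the input, so the sketch reports EOF in that case.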
void wxJSONReader::StoreComment (const wxJSONValue *parent) [protected]
Store the comment string in the value it refers to.
The function searches for a suitable value object for storing the comment line that was read by the parser and temporarily stored in m_comment. The function searches the three values pointed to by:
m_next
m_current
m_lastStored
The comment is stored in one of m_next, m_current or m_lastStored.
int wxJSONReader::ReadString (wxInputStream &is, wxJSONValue &val) [protected]
Read a string value.
The function reads a string value from the input stream and it is called by the DoRead() function when it encounters a double quote character. The function reads all bytes up to the next double quote (unless it is escaped) and stores them in a temporary UTF-8 memory buffer. Also, the function processes the escaped characters defined in the JSON syntax.
Next, the function tries to convert the UTF-8 buffer to a wxString object using the wxString::FromUTF8 function. Depending on the build mode, we can have the following:
[ "This is a very long string value which is splitted into more" "than one line because it is more human readable" ]
int wxJSONReader::ReadToken (wxInputStream &is, int ch, wxString &s) [protected]
Reads a token string.
This function is called by ReadValue() when the first character encountered is not a special char and it is not a double-quote. The only possible type is a literal or a number, which both lie in the US-ASCII charset, so their UTF-8 encoding is the same as US-ASCII. The function simply reads one byte at a time from the stream and appends it to a wxString object. Returns the next character read.
A token cannot include unicode escaped sequences so this function does not try to interpret such sequences.
is | the input stream
ch | the character read by DoRead
s | the string object that contains the token read
int wxJSONReader::ReadValue (wxInputStream &is, int ch, wxJSONValue &val) [protected]
Read a value from input stream.
The function is called by DoRead() when it encounters a char that is neither a special char nor a double-quote. It assumes that the string is a numeric value or a literal boolean value and stores it in the wxJSONValue object val.
The function also checks that val is of type wxJSONTYPE_INVALID, otherwise an error is reported because a value cannot follow another value: maybe a (,) or (:) is missing.
If the literal starts with a digit, a plus or minus sign, the function tries to interpret it as a number. The following are tried by the function, in this order:
int wxJSONReader::ReadUES (wxInputStream &is, char *uesBuffer) [protected]
Read a 4-hex-digit unicode character.
The function is called by ReadString() when the \u sequence is encountered; the sequence introduces an escaped character in the form:
\uXXXX
The four hex digits are stored in the uesBuffer parameter. Returns the character after the hex sequence or -1 on EOF.
NOTICE: the wxJSON library reads and recognizes all unicode characters in the BMP written in this form, not only control characters.
int wxJSONReader::AppendUES (wxMemoryBuffer &utf8Buff, const char *uesBuffer) [protected]
The function appends a Unicode Escaped Sequence to the temporary UTF8 buffer.
This function is called by ReadString() when a unicode escaped sequence is read from the input text as for example:
\u0001
which represents a control character. The uesBuffer parameter contains the 4 hexadecimal digits that are read by ReadUES.
The function tries to convert the 4 hex digits to a wchar_t character which is appended to the memory buffer utf8Buff after converting it to UTF-8.
If the conversion from hexadecimal fails, the function does not store the character in the UTF-8 buffer and an error is reported. The function is the same in ANSI and Unicode. Returns -1 if the buffer does not contain a valid sequence of hex digits. On success returns ZERO.
utf8Buff | the UTF-8 buffer to which the control char is written
uesBuffer | the four hex digits read from the input text
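The conversion from four hex digits to UTF-8 bytes can be sketched in standard C++. This is an illustration under stated assumptions rather than the library code: appendEscapedCodePoint is a hypothetical name, a std::string stands in for the wxMemoryBuffer, and surrogate code units are encoded as they are without pairing.

```cpp
#include <cassert>
#include <string>

// Converts exactly four hex digits to a BMP code point and appends its
// UTF-8 encoding to `utf8`. Returns 0 on success, -1 on invalid digits
// (in which case nothing is appended).
int appendEscapedCodePoint(std::string& utf8, const char* hex4) {
    assert(hex4 != nullptr);
    unsigned int cp = 0;
    for (int i = 0; i < 4; ++i) {
        const char c = hex4[i];
        unsigned int digit = 0;
        if (c >= '0' && c <= '9')      digit = static_cast<unsigned int>(c - '0');
        else if (c >= 'a' && c <= 'f') digit = static_cast<unsigned int>(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F') digit = static_cast<unsigned int>(c - 'A' + 10);
        else return -1;                // also stops at a premature NUL
        cp = (cp << 4) | digit;
    }

    if (cp < 0x80) {                   // 0xxxxxxx
        utf8 += static_cast<char>(cp);
    } else if (cp < 0x800) {           // 110xxxxx 10xxxxxx
        utf8 += static_cast<char>(0xC0 | (cp >> 6));
        utf8 += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                           // 1110xxxx 10xxxxxx 10xxxxxx
        utf8 += static_cast<char>(0xE0 | (cp >> 12));
        utf8 += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        utf8 += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return 0;
}
```

Four hex digits can only express code points up to U+FFFF, so three bytes are always enough for the encoded result.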
int wxJSONReader::NumBytes (char ch) [protected]
Return the number of bytes that make a character in stream input.
This function returns the number of bytes that represent a unicode code point in various encodings. For example, if the input stream is UTF-32 the function returns 4. Because the only recognized format for streams is UTF-8 the function just calls UTF8NumBytes() and returns its result. The function is currently not used at all.
int wxJSONReader::ConvertCharByChar (wxString &s, const wxMemoryBuffer &utf8Buffer) [protected]
Convert a UTF-8 memory buffer one char at a time.
This function is used in ANSI mode when input from a stream is in UTF-8 format and the UTF-8 buffer read cannot be converted to the locale wxString object. The function performs a char-by-char conversion of the buffer and appends every representable character to the string s. Characters that cannot be represented are stored as unicode escaped sequences in the form:
\uXXXX
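The char-by-char fallback can be sketched as below. This is only an approximation of the idea: convertCharByChar here is a hypothetical free function on std::string, the target charset is assumed to be plain US-ASCII instead of the current locale, sequences beyond the BMP are skipped, continuation bytes are not validated, and returning the number of escaped characters is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Appends each character of `utf8` to `s`: US-ASCII characters are copied,
// every other BMP character is written as \uXXXX.
// Returns the number of characters that had to be escaped.
int convertCharByChar(std::string& s, const std::string& utf8) {
    int escaped = 0;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char b0 = static_cast<unsigned char>(utf8[i]);
        unsigned int cp = 0;
        std::size_t len = 0;
        if (b0 < 0x80)                { cp = b0;        len = 1; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
        else { ++i; continue; }            // invalid lead byte or beyond the BMP
        if (i + len > utf8.size()) break;  // truncated sequence at end of buffer

        // Fold the payload bits of the continuation bytes into the code point.
        for (std::size_t k = 1; k < len; ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        }
        i += len;

        if (cp < 0x80) {
            s += static_cast<char>(cp);    // representable in the assumed charset
        } else {
            char buf[8];
            std::snprintf(buf, sizeof buf, "\\u%04X", cp);
            s += buf;
            ++escaped;
        }
    }
    return escaped;
}
```

Escaping the non-representable characters keeps the information in the string, so a later writer can still reproduce the original text.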
int wxJSONReader::m_flags [protected]
Flags that control the parser behaviour.
int wxJSONReader::m_maxErrors [protected]
Maximum number of errors stored in the error's array.
int wxJSONReader::m_lineNo [protected]
The current line number (starts at 1).
int wxJSONReader::m_colNo [protected]
The current column number (starts at 1).
int wxJSONReader::m_level [protected]
The current level of object/array nesting (starts at ZERO).
int wxJSONReader::m_depth [protected]
The depth level of the read JSON text.
wxJSONValue* wxJSONReader::m_current [protected]
The pointer to the value object that is being read.
wxJSONValue* wxJSONReader::m_lastStored [protected]
The pointer to the value object that was last stored.
wxJSONValue* wxJSONReader::m_next [protected]
The pointer to the value object that will be read.
wxString wxJSONReader::m_comment [protected]
The comment string read by SkipComment().
int wxJSONReader::m_commentLine [protected]
The starting line of the comment string.
wxArrayString wxJSONReader::m_errors [protected]
The array of error messages.
wxArrayString wxJSONReader::m_warnings [protected]
The array of warning messages.
int wxJSONReader::m_peekChar [protected]
The character read by the PeekChar() function (-1 none).
bool wxJSONReader::m_noUtf8 [protected]
ANSI: do not convert UTF-8 strings.