C-style strings: a file format conundrum


I am very confused by this small problem. I have a non-indexed file format header (more specifically, the ID3 header). This header stores a string, or rather three bytes, to confirm that the data really is an ID3 tag (the string is TAG, by the way). The point is that the TAG in the file format is not null-terminated, so there are two things that can be done:

  • Load the entire file with fread and do the comparison against the non-terminated string using strncmp. But this seems hacky: what if someone later opens it up and tries to manipulate the string?
  • Load the file into a struct that does not map exactly onto the file format but includes the proper null terminators, loading each member with its own call. But this also seems hacky, and it is tedious.

Help, especially from people who have dealt with this kind of thing, is appreciated.

If the file format specification says that a certain three bytes have the values corresponding to 'T', 'A', 'G' (84, 65, 71), then you should compare just those three bytes.

For this example, strncmp() would be OK. In general, memcmp() is better because it does not have to worry about string termination: even if the byte stream (tag) you are comparing contains ASCII NUL '\0' characters, memcmp() will still work.
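As a minimal sketch of that comparison, assuming the three marker bytes have already been read into a buffer: the helper names has_tag_marker and has_signature are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf starts with the bytes 'T', 'A', 'G'. */
int has_tag_marker(const unsigned char *buf, size_t len)
{
    static const unsigned char marker[3] = { 'T', 'A', 'G' }; /* 84, 65, 71 */

    assert(buf != NULL);
    if (len < sizeof marker)
        return 0;                       /* not enough bytes to hold the marker */
    return memcmp(buf, marker, sizeof marker) == 0;
}

/* Hypothetical generic form: compares a fixed-size signature that may
 * itself contain zero bytes, which strncmp() would treat as terminators. */
int has_signature(const unsigned char *buf, size_t len,
                  const unsigned char *sig, size_t sig_len)
{
    assert(buf != NULL && sig != NULL);
    return len >= sig_len && memcmp(buf, sig, sig_len) == 0;
}
```

Neither helper needs a terminator in the buffer, so the bytes can be compared exactly as they come from fread. With a signature such as { 0, 1, 0 }, strncmp() would stop at the first zero byte and report a match against any buffer that also starts with zero, whereas memcmp() compares all three bytes.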

You also need to recognize whether the file format you are working with is primarily printable data or primarily binary data. The techniques used for printable data can differ from the techniques used for binary data; the techniques used for binary data sometimes (but not always) translate for use with printable data. One major difference is that the lengths of values in binary data are known in advance, either because the length is embedded in the file or because the structure of the file is known. With printable data, you are often dealing with variable-length encodings with implicit boundaries on the fields, and no length information ahead of the data.

For example, the Unix password file format is a text encoding with variable-length fields; it uses ':' to separate the fields. You cannot tell how long a field is until you reach the next ':' or the end of the line. This requires different handling from a binary format encoded using ASN.1¹, where a field can have a type indicator value (usually a byte) and a length (which may be 1, 2 or 4 bytes, depending on the type) before the actual data of the field.
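To illustrate the text side of that contrast, here is a rough sketch of pulling one ':'-separated field out of a password-style line. The function get_text_field and its parameters are hypothetical; the point is only that the field length is discovered by scanning for the delimiter rather than read from the data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the index-th (0-based) ':'-separated field of
 * line into out as a NUL-terminated string.
 * Returns 0 on success, -1 if the field does not exist or does not fit. */
int get_text_field(const char *line, int index, char *out, size_t out_size)
{
    const char *p = line;
    size_t n;
    int i;

    assert(line != NULL && out != NULL);

    /* Skip the fields before the one we want. */
    for (i = 0; i < index; i++) {
        p += strcspn(p, ":\n");     /* stops at ':', '\n' or the final NUL */
        if (*p != ':')
            return -1;              /* ran out of fields on this line */
        p++;                        /* step over the ':' */
    }

    /* The length is only known once the next delimiter has been found. */
    n = strcspn(p, ":\n");
    if (n + 1 > out_size)
        return -1;
    memcpy(out, p, n);
    out[n] = '\0';
    return 0;
}
```

Called on a line such as "root:x:0:0:root:/root:/bin/sh\n" with index 0, this copies "root" into out; nothing in the data says the field is four bytes long until the first ':' is reached.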


¹ ASN.1 is (justifiably) regarded as very complex; I have given a very simple example of roughly how it is used, which can be criticized on many levels. Even so, the basic idea is valid: the length (and, with ASN.1, usually the type too) comes before the (binary) data. This is also known as TLV (type, length, value) encoding.
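A deliberately simplified sketch of reading one type-length-value record follows. It assumes a one-byte type followed by a one-byte length, which is not real ASN.1 (whose length encoding has several forms); struct tlv and tlv_parse are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One decoded record; value points into the caller's buffer, nothing is copied. */
struct tlv {
    unsigned char type;
    size_t length;
    const unsigned char *value;
};

/* Hypothetical parser for a simplified layout: [type:1][length:1][value:length].
 * Returns the number of bytes consumed, or 0 if the buffer is too short. */
size_t tlv_parse(const unsigned char *buf, size_t len, struct tlv *out)
{
    size_t vlen;

    assert(buf != NULL && out != NULL);
    if (len < 2)
        return 0;               /* no room for the type and length bytes */
    vlen = buf[1];              /* the length is known before the value is read */
    if (len - 2 < vlen)
        return 0;               /* value is truncated */
    out->type = buf[0];
    out->length = vlen;
    out->value = buf + 2;
    return 2 + vlen;
}
```

Because the length precedes the data, the value can contain any byte, including zero, and the reader never has to scan for a delimiter or terminator. A successful parse always consumes at least two bytes, so a return value of 0 unambiguously signals a short buffer.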

