Binary dump file format specifications
A binary file is a computer file that is not a text file. Binary files are usually thought of as being a sequence of bytes, which means the binary digits (bits) are grouped in eights. Binary files typically contain bytes that are intended to be interpreted as something other than text characters. Compiled computer programs are typical examples; indeed, compiled applications are sometimes referred to, particularly by programmers, as binaries. But binary files can also contain images, sounds, compressed versions of other files, etc.
Some binary files contain headers, blocks of metadata used by a computer program to interpret the data in the file. The header often contains a signature or magic number which can identify the format. For example, a GIF file can contain multiple images, and headers are used to identify and describe each block of image data.
If a binary file does not contain any headers, it may be called a flat binary file. To send binary files through certain systems that do not allow all data values, they are often translated into a plain text representation using, for example, Base64. The encoding increases the size of the data, but this may be countered by lower-level link compression: the resulting text data has about as much less entropy as it has increased in size, so the actual data transferred in this scenario would likely be very close to the size of the original binary data.
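As a rough illustration of the size trade-off, the sketch below uses Base64 as the binary-to-text encoding and `zlib` as a stand-in for link compression. Both choices are assumptions made for the example, and the random input merely simulates the contents of a binary file.

```python
import base64
import os
import zlib

# 3000 random bytes stand in for the contents of a binary file.
raw = os.urandom(3000)

# Base64 maps every 3 input bytes to 4 ASCII characters.
encoded = base64.b64encode(raw)

# Decoding restores the original bytes exactly.
decoded = base64.b64decode(encoded)

# zlib plays the role of a compressing link layer (an assumption).
compressed = zlib.compress(encoded)

print(len(raw), len(encoded), len(compressed))
```

The encoded form is one third larger than the input, while the compressed form should land much closer to the original size.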
See Binary-to-text encoding for more on this subject. A hex editor or viewer may be used to view file data as a sequence of hexadecimal (or decimal, binary, or ASCII character) values for the corresponding bytes of a binary file.
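A minimal sketch of what such a viewer does is given here. The function name `hex_dump` and the three-column layout (offset, hexadecimal values, ASCII) are choices made for this example, not a standard.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Return a hex-viewer style listing: offset, hex values, ASCII column."""
    hex_width = width * 3 - 1  # two hex digits per byte plus separating spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # Show printable ASCII as-is and every other byte as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{hex_width}}  {text_part}")
    return "\n".join(lines)


print(hex_dump(b"GIF89a\x01\x00\x01\x00"))
```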
If a binary file is opened in a text editor, each group of eight bits will typically be translated as a single character, and the user will see a probably unintelligible display of textual characters. If the file is opened in some other application, that application will have its own use for each byte. Other types of viewers, called 'word extractors', simply replace the unprintable characters with spaces, revealing only the human-readable text.
This type of view is useful for quick inspection of a binary file in order to find passwords in games, find hidden text in non-text files and recover corrupted documents.
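The word-extractor idea can be sketched in a few lines. This version treats only printable ASCII as readable, which is a simplifying assumption; the sample bytes are made up for the example.

```python
def extract_words(data: bytes) -> str:
    """Replace every non-printable byte with a space, keeping readable ASCII."""
    return "".join(chr(b) if 32 <= b < 127 else " " for b in data)


# Made-up sample: readable fragments surrounded by unprintable bytes.
sample = b"\x00\x01SAVE\x00\xffname=demo\x00\x7f"
print(extract_words(sample))
```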
If the file is itself treated as an executable and run, then the operating system will attempt to interpret the file as a series of instructions in its machine language. Standards are very important to binary files. For example, a binary file interpreted by the ASCII character set will result in text being displayed.
A custom application can interpret the file differently. Binary itself is meaningless until such time as an executed algorithm defines what should be done with each bit, byte, word or block. Thus, just examining the binary and attempting to match it against known formats can lead to the wrong conclusion as to what it actually represents. This fact can be used in steganography, where an algorithm interprets a binary data file differently to reveal hidden content.
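To make the point concrete, the following sketch reads the same four bytes in three ways. The byte values and the choice of a little-endian 32-bit integer are arbitrary and serve only as illustration.

```python
import struct

data = bytes([72, 105, 33, 0])

# Interpretation 1: the first three bytes as ASCII text.
as_text = data[:3].decode("ascii")

# Interpretation 2: four separate numbers between 0 and 255.
as_numbers = list(data)

# Interpretation 3: one unsigned 32-bit integer, little-endian.
as_uint32 = struct.unpack("<I", data)[0]

print(as_text, as_numbers, as_uint32)
```

None of these readings is more correct than the others; only the consuming program decides which one applies.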
Without the algorithm, it is impossible to tell that hidden content exists.
Two files that are binary compatible will have the same sequence of zeros and ones in the data portion of the file. The file header, however, may be different. The term is used most commonly to state that data files produced by one application are exactly the same as data files produced by another application. For example, some software companies produce applications for Windows and the Macintosh that are binary compatible, which means that a file produced in a Windows environment is interchangeable with a file produced on a Macintosh.
This avoids many of the conversion problems caused by importing and exporting data. One possible binary compatibility issue between different computers is the endianness of the computer.
Some computers store the bytes in a file in a different order.
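A short sketch of the byte-order difference follows, using Python's `struct` module. The value `0x12345678` is arbitrary.

```python
import struct
import sys

value = 0x12345678

little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first

print(little.hex())  # 78563412
print(big.hex())     # 12345678

# Byte order of the machine running this script.
print(sys.byteorder)
```

A file written with one byte order and read with the other yields a different number from the same four bytes, which is why formats usually fix the order in their specification.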
A file format is a standard way that information is encoded for storage in a computer file. It specifies how bits are used to encode information in a digital storage medium. File formats may be either proprietary or free and may be either unpublished or open. Some file formats are designed for very particular types of data: PNG files, for example, store bitmapped images using lossless data compression.
Other file formats, however, are designed for storage of several different types of data. A text file can contain any stream of characters, including possible control characters, and is encoded in one of various character encoding schemes. Some file formats, such as HTML, scalable vector graphics, and the source code of computer software, are text files with defined syntaxes that allow them to be used for specific purposes. File formats often have a published specification describing the encoding method and enabling testing of program intended functionality.
If the developer of a format doesn't publish free specifications, another developer looking to utilize that kind of file must either reverse engineer the file to find out how to read it or acquire the specification document from the format's developers for a fee and by signing a non-disclosure agreement.
The latter approach is possible only when a formal specification document exists. Both strategies require significant time, money, or both; therefore, file formats with publicly available specifications tend to be supported by more programs.
Patent law, rather than copyright, is more often used to protect a file format. Although patents for file formats are not directly permitted under US law, some formats encode data using patented algorithms.
For example, using compression with the GIF file format requires the use of a patented algorithm, and though the patent owner did not initially enforce their patent, they later began collecting royalty fees.
This has resulted in a significant decrease in the use of GIFs, and is partly responsible for the development of the alternative PNG format. However, the patent has since expired, first in the US and then worldwide.
Different operating systems have traditionally taken different approaches to determining a particular file's format, with each approach having its own advantages and disadvantages.
Most modern operating systems and individual applications need to use all of the following approaches to read "foreign" file formats, if not work with them completely. One popular method is to determine the format of a file from the end of its name, the letters following the final period. This portion of the filename is known as the filename extension. For example, HTML documents are identified by names that end with a characteristic extension.
In the original FAT filesystem, file names were limited to an eight-character identifier and a three-character extension, known as an 8.3 filename.
There are only so many three-letter extensions, so any given extension might be linked to more than one program. Many formats still use three-character extensions even though modern operating systems and application programs no longer have this limitation. Since there is no standard list of extensions, more than one format can use the same extension, which can confuse both the operating system and users. One artifact of this approach is that the system can easily be tricked into treating a file as a different format simply by renaming it: an HTML file can, for instance, be easily treated as plain text by giving it a plain-text extension.
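Extension-based detection amounts to a table lookup on the letters after the final period. In the sketch below, the table, the function name `guess_format` and the file names are all hypothetical; real systems keep far larger registries.

```python
from pathlib import Path

# Hypothetical extension table, for illustration only.
FORMATS_BY_EXTENSION = {
    ".html": "HTML document",
    ".txt": "plain text",
    ".png": "PNG image",
}


def guess_format(filename: str) -> str:
    """Guess a format from the letters after the final period only."""
    extension = Path(filename).suffix.lower()
    return FORMATS_BY_EXTENSION.get(extension, "unknown")


# Renaming alone changes the answer; the file contents are never inspected.
print(guess_format("page.html"))
print(guess_format("page.txt"))
```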
Although this strategy was useful to expert users who could easily understand and manipulate this information, it was often confusing to less technical users, who could accidentally make a file unusable or "lose" it by renaming it incorrectly. This led more recent operating system shells, such as Windows 95 and Mac OS X, to hide the extension when listing files.
This prevents the user from accidentally changing the file type, while still allowing expert users to turn this feature off and display the extensions. Hiding the extension, however, can create the appearance of two or more identical filenames in the same folder. For example, a company logo may be needed in two different formats. With the extensions visible, the two files would appear under unique filenames, each consisting of "CompanyLogo" followed by its own extension. On the other hand, hiding the extensions would make both appear as "CompanyLogo".
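Hiding extensions can also pose a security risk. A file may carry an innocuous-looking extension followed by a second, hidden one, so that the user sees only the harmless-looking name. However, the operating system would still see the real extension and act on it. The same is true of files with only one extension, since nothing about the file type is shown to the user. Extensions can also be spoofed.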
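The gap between what a user sees and what the system sees can be sketched as follows. The file name `report.jpg.exe` and both helper functions are hypothetical, and `visible_name` only imitates a shell that hides the final extension.

```python
from pathlib import PurePath


def visible_name(filename: str) -> str:
    """Imitate a shell that hides the final extension when listing files."""
    return PurePath(filename).stem


def has_multiple_extensions(filename: str) -> bool:
    """Flag names such as 'report.jpg.exe' that carry more than one extension."""
    return len(PurePath(filename).suffixes) > 1


name = "report.jpg.exe"  # hypothetical file name
print(visible_name(name))             # what the user would see
print(PurePath(name).suffix)          # what the system acts on
print(has_multiple_extensions(name))
```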
Some Word macro viruses create a Word file in template format and save it under a non-template extension. Since Word generally ignores extensions and looks at the format of the file, these would open as templates, execute, and spread the virus. To further trick users, it is possible to store an icon inside the program, in which case some operating systems' icon assignment for the executable is overridden by the stored icon. This issue requires users with extensions hidden to be vigilant, and never to let the operating system choose with what program to open a file not known to be trustworthy (which contradicts the idea of making things easier for the user).
This represents a practical problem for Windows systems where extension-hiding is turned on by default.
A second way to identify a file format is to use information regarding the format stored inside the file itself, either information meant for this purpose or binary strings that happen to always be in specific locations in files of some formats. Since the easiest place to locate them is at the beginning, such an area is usually called a file header when it is greater than a few bytes, or a magic number if it is just a few bytes long.
The metadata contained in a file header are usually stored at the start of the file, but might be present in other areas too, often including the end, depending on the file format or the type of data contained.
Character-based text files usually have character-based headers, whereas binary formats usually have binary headers, although this is not a rule. Text-based file headers usually take up more space, but being human-readable, they can easily be examined by using simple software such as a text editor or a hexadecimal editor.
As well as identifying the file format, file headers may contain metadata about the file and its contents. For example, most image files store information about image format, size, resolution and color space, and optionally authoring information such as who made the image, when and where it was made, what camera model and photographic settings were used (Exif), and so on.
Such metadata may be used by software reading or interpreting the file during the loading process and afterwards. File headers may be used by an operating system to quickly gather information about a file without loading it all into memory, but doing so uses more of a computer's resources than reading directly from the directory information. For instance, when a graphic file manager has to display the contents of a folder, it must read the headers of many files before it can display the appropriate icons, but these will be located in different places on the storage medium thus taking longer to access.
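As one concrete case of reading header metadata without loading the whole file, the sketch below pulls the image dimensions out of the first 24 bytes of a PNG file. PNG is chosen here only as a familiar example; the function name and the file path in the usage comment are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(header: bytes) -> tuple[int, int]:
    """Read width and height from the first 24 bytes of a PNG file."""
    if len(header) < 24 or header[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG header")
    # Bytes 8-15 hold the first chunk's length and type, which must be IHDR.
    _length, chunk_type = struct.unpack(">I4s", header[8:16])
    if chunk_type != b"IHDR":
        raise ValueError("first chunk is not IHDR")
    # Width and height follow as big-endian 32-bit unsigned integers.
    width, height = struct.unpack(">II", header[16:24])
    return width, height


# Usage with a real file (the path is hypothetical):
# with open("picture.png", "rb") as f:
#     print(png_dimensions(f.read(24)))
```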
A folder containing many files with complex metadata such as thumbnail information may require considerable time before it can be displayed. If a header is binary hard-coded such that the header itself needs complex interpretation in order to be recognized, there is a risk that the file format can be misinterpreted.
It may even have been badly written at the source. This can result in corrupt metadata which, in extremely bad cases, might even render the file unreadable. A more complex example of file headers is the kind used for wrapper or container file formats.
One way to incorporate file type metadata, often associated with Unix and its derivatives, is just to store a "magic number" inside the file itself.
Originally, this term was used for a specific set of 2-byte identifiers at the beginnings of files, but since any binary sequence can be regarded as a number, any feature of a file format which uniquely distinguishes it can be used for identification.
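A minimal sketch of magic-number detection compares the first bytes of the data against a table of known signatures. The table below is a small hand-picked sample and the function name `sniff_format` is hypothetical; real magic databases are far larger and use more elaborate tests.

```python
# A tiny, hand-picked signature table, for illustration only.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP archive"),
]


def sniff_format(data: bytes) -> str:
    """Return the first format whose signature matches the start of data."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


print(sniff_format(b"GIF89a\x01\x00\x01\x00"))
print(sniff_format(b"plain words"))
```

Each candidate signature is tried in turn, so the cost of a lookup grows with the size of the table.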
Many file types, especially plain-text files, are harder to spot by this method. The magic number approach offers better guarantees that the format will be identified correctly, and can often determine more precise information about the file.
Since reasonably reliable "magic number" tests can be fairly complex, and each file must effectively be tested against every possibility in the magic database, this approach is relatively inefficient, especially for displaying large lists of files (in contrast, file name and metadata-based methods need to check only one piece of data and match it against a sorted index).
Also, data must be read from the file itself, increasing latency as opposed to metadata stored in the directory. Where file types don't lend themselves to recognition in this way, the system must fall back to metadata. It is, however, the best way for a program to check if the file it has been told to process is of the correct format. On the other hand, a valid magic number does not guarantee that the file is not corrupt or is of a correct type. So-called shebang lines in script files are a special case of magic numbers.
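A shebang line is a first line that begins with the two characters `#!`, followed by the path of an interpreter and, optionally, an argument. The sketch below splits such a line into those two parts; the function name `parse_shebang` is hypothetical, and treating everything after the interpreter as a single argument is a simplifying assumption.

```python
def parse_shebang(first_line: str):
    """Return (interpreter, argument) for a '#!' line, or None if absent."""
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].strip().split(maxsplit=1)
    if not parts:
        return None
    interpreter = parts[0]
    argument = parts[1] if len(parts) > 1 else ""
    return interpreter, argument


print(parse_shebang("#!/usr/bin/env python3\n"))
print(parse_shebang("print('no shebang here')\n"))
```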
Here, the magic number is human-readable text that identifies a specific command interpreter and options to be passed to the command interpreter. Another operating system using magic numbers is AmigaOS, where magic numbers were called "Magic Cookies" and were adopted as a standard system to recognize executables in the Hunk executable file format and also to let single programs, tools and utilities deal automatically with their saved data files, or any other kind of file types when saving and loading data.
This system was then enhanced with the Amiga standard Datatype recognition system.
A final way of storing the format of a file is to explicitly store information about the format in the file system, rather than within the file itself. This approach keeps the metadata separate from both the data and the name, but is also less portable than either file extensions or "magic numbers", since the format has to be converted from filesystem to filesystem.
While this is also true to an extent with filename extensions—for instance, for compatibility with MS-DOS's three character limit—most forms of storage have a roughly equivalent definition of a file's data and name, but may have varying or no representation of further metadata.
Note that zip files or archive files solve the problem of handling metadata: a utility program collects the files, together with metadata about each of them, into one new file. The new file is also compressed and possibly encrypted, but now is transmissible as a single file across operating systems by FTP systems or attached to email. At the destination, it must be unzipped by a compatible utility to be useful, but the problems of transmission are solved this way.
The classic Mac OS file system stores codes for creator and type as part of the directory entry for each file. These codes are referred to as OSTypes. They could be any 4-byte sequence, but were often selected so that the ASCII representation formed a sequence of meaningful characters, such as an abbreviation of the application's name or the developer's initials.
The type code specifies the format of the file, while the creator code specifies the default program to open it with when double-clicked by the user.
For example, the user could have several text files all with the type code of TEXT, but which each open in a different program, due to having differing creator codes.
This feature was intended so that, for example, human-readable plain-text files could be opened in a general purpose text editor, while programming or HTML code files would open in a specialized editor or IDE, but this feature was often the source of user confusion, as which program would launch when the files were double-clicked was often unpredictable.
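The split between type code and creator code can be sketched with a small record. Everything here is hypothetical apart from the `TEXT` type code mentioned in the text: the creator codes `EDIT` and `CODE`, the application names and the file names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    name: str
    type_code: bytes     # four bytes naming the format
    creator_code: bytes  # four bytes naming the default application


# Hypothetical creator codes and application names, for illustration only.
APPLICATIONS = {
    b"EDIT": "general-purpose text editor",
    b"CODE": "programming editor",
}


def application_for(info: FileInfo) -> str:
    """Pick the program from the creator code; the type code plays no part."""
    return APPLICATIONS.get(info.creator_code, "unknown application")


notes = FileInfo("notes", b"TEXT", b"EDIT")
source = FileInfo("main.c", b"TEXT", b"CODE")

print(application_for(notes))
print(application_for(source))
```

Both records share the same type code, yet they open in different programs, which is the behaviour users found hard to predict.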
RISC OS uses a similar system, consisting of a number which can be looked up in a table of descriptions. Some common and standard Uniform Type Identifiers (UTIs) use a domain called public. UTIs can be defined within a hierarchical structure, known as a conformance hierarchy. A UTI can exist in multiple hierarchies, which provides great flexibility. Extended attributes, on file systems that support them, comprise an arbitrary set of triplets with a name, a coded type for the value and a value, where the names are unique and values can be up to 64 KB long.
One such attribute is ".TYPE", which is used to determine the file type. Its value comprises a list of one or more file types associated with the file, each of which is a string, such as "Plain Text" or "HTML document". Thus a file may have several types. The Win32 subsystem does not use this information; instead, it relies on other file forks to store meta-information in Win32-specific formats.
Although not yet widely used outside of UK government and some digital preservation programmes, the PUID scheme does provide greater granularity than most alternative schemes.
MIME types are widely used in many Internet-related applications, and increasingly elsewhere, although their usage for on-disc type information is rare. These were originally intended as a way of identifying what type of file was attached to an e-mail, independent of the source and target operating systems.
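For a feel of what MIME type strings look like, the sketch below uses Python's standard `mimetypes` module, which guesses a type from the file name's extension rather than from the contents. The wrapper name `mime_type_for`, the file names and the fallback to `application/octet-stream` are choices made for this example.

```python
import mimetypes


def mime_type_for(filename: str) -> str:
    """Guess a MIME type from the file name, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


print(mime_type_for("page.html"))
print(mime_type_for("logo.png"))
```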
There are problems with the MIME types though; several organisations and people have created their own MIME types without registering them properly with IANA, which makes the use of this standard awkward in some cases.
File format identifiers are another, not widely used way to identify file formats according to their origin and their file category. The scheme was created for the Description Explorer suite of software. The final part is composed of the usual file extension of the file or the international standard number of the file, padded left with zeros.
Another but less popular way to identify the file format is to examine the file contents for distinguishable patterns among file types.
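One very simple content-based check is to guess whether data is text or binary from the bytes it contains. The heuristic below, including its 95% threshold and the function name `looks_like_text`, is an assumption made for illustration and not an established rule.

```python
def looks_like_text(data: bytes, threshold: float = 0.95) -> bool:
    """Heuristic: text rarely contains NUL bytes or many control characters."""
    if not data:
        return True
    if b"\x00" in data:
        return False
    allowed_controls = {9, 10, 13}  # tab, line feed, carriage return
    plain = sum(1 for b in data if b >= 32 or b in allowed_controls)
    return plain / len(data) >= threshold


print(looks_like_text(b"Hello, world\n"))
print(looks_like_text(b"\x89PNG\r\n\x1a\n\x00\x00"))
```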