UTFCast Pro User Manual

What is a warning

A warning is a message that appears when detecting or converting files. It is not an error, but indicates that the detection engine is uncertain about the type or codepage of the file.

The process of detecting the codepage is based on data statistics and is never 100% accurate. UTFCast Pro uses four different detection engines to improve the accuracy of the detection result. However, if the file is too small or contains insufficient text, it may be difficult to determine the data type. In such cases, a warning is displayed.

Files with a warning status are not converted. If you have selected the option to copy unconverted files, these files will be copied to the output directory without being converted. You can use the preview panel to verify the accuracy of the detection result. If the result is correct, you can ignore the warning and convert the file using the "Accept Result" function. Alternatively, you can specify a different codepage for conversion using the "Make Correction" function.

File name filters

Wildcard filter

A wildcard is a symbol used to represent one or more unknown characters in a file name. UTFCast Pro supports two wildcard symbols: the asterisk (*) for any number of unknown characters, and the question mark (?) for only one unknown character.

You can combine multiple wildcard strings by separating them with a semicolon (;). A file name that does not match any of the specified wildcard strings is ignored.

Examples

To only pick file names that start with the letter 'W' and end with the '.TXT' extension, use the wildcard string:

w*.txt

To only pick file names that have two characters and the '.PHP' extension, use the wildcard string:

??.php

To only pick file names that have two or three characters and end with any extension, use the wildcard string:

??.*; ???.*
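
The matching rules above can be approximated in a few lines of Python. This is only a sketch: the function name matches_wildcards is hypothetical, case-insensitive matching is assumed from the first example (w*.txt picking names that start with 'W'), and Python's fnmatch module also accepts [seq] character classes, which this manual does not mention.

```python
from fnmatch import fnmatchcase


def matches_wildcards(file_name: str, wildcard_string: str) -> bool:
    """Return True if file_name matches any pattern in a ';'-separated wildcard string."""
    # Split on the semicolon separator and drop surrounding spaces and empty entries.
    patterns = [p.strip() for p in wildcard_string.split(";") if p.strip()]
    # Assumption: matching ignores case, so both sides are lowercased.
    name = file_name.lower()
    return any(fnmatchcase(name, pattern.lower()) for pattern in patterns)


picked = [
    name
    for name in ["ab.php", "abc.md", "abcd.md"]
    if matches_wildcards(name, "??.*; ???.*")
]
```

Here picked keeps only the names whose part before the first dot is two or three characters long.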

Regular expression filter

UTFCast Pro supports the ECMAScript (ECMA-262) regular expression format. For more information on the ECMAScript standard, please visit the ECMA website or download the specification document from http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf.

Example

To pick file names that are numeric and have a '.TXT' extension, use the following regular expression. Note that the dot is escaped so that it matches a literal '.' rather than any character:

\d+\.txt
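
The effect of escaping the dot can be checked with a short Python sketch. Python's re module agrees with ECMAScript for this simple pattern when the re.ASCII flag is set. Whether UTFCast Pro matches the whole file name or only part of it is not stated in this manual, so whole-name matching is an assumption here, and the helper name full_match is hypothetical.

```python
import re

# re.ASCII restricts \d to 0-9, which is how ECMAScript defines \d.
ESCAPED = re.compile(r"\d+\.txt", re.ASCII)
UNESCAPED = re.compile(r"\d+.txt", re.ASCII)


def full_match(pattern: "re.Pattern[str]", file_name: str) -> bool:
    """Return True if the whole file name matches the pattern (assumed behavior)."""
    return pattern.fullmatch(file_name) is not None


strict = full_match(ESCAPED, "2024xtxt")    # the escaped dot requires a literal '.'
loose = full_match(UNESCAPED, "2024xtxt")   # the bare dot accepts any character
```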

Settings

In most cases, UTFCast Pro works effectively with its default settings. However, for large-sized files or when converting a large number of small-sized files, you may experience varying performance. The settings options allow you to optimize the converter parameters for your specific needs and enhance its performance on your system. Additionally, the settings options let you modify the default behaviors during Instant Conversion and the default parameters on the Custom Conversion dialog.

Default Settings

Changes to the following settings will affect the behavior of both the Instant Conversion and the default settings of the Custom Conversion.

Copy unconverted files

By default, Instant Conversion does not copy files that are not converted to the output directory. If you want to change this behavior, you can do so in this setting. This setting can also affect the "Copy Unconverted" setting in the Custom Conversion function.

Write BOM to converted files

If you want Instant Conversion to write a BOM (Byte Order Mark) to the output files, set this setting to "On." This setting can also affect the BOM setting in the Custom Conversion function.

Process hidden files

When this option is turned off, hidden files and hidden folders in the source folder will be ignored.

Output encoding

Specify the encoding for Instant Conversion, In-Place Conversion, and the default setting for Custom Conversion.

Return type

Specify the return type for Instant Conversion, In-Place Conversion, and the default setting for Custom Conversion.

Advanced Settings

Converter Threads

The Converter Threads setting determines the number of parallel processing threads that UTFCast Pro uses when converting files. By default, this setting is auto-detected and set to the number of logical CPUs on your system. The purpose of this setting is to maximize the conversion speed and fine-tune the system resource usage.

In general, increasing the number of Converter Threads will result in faster conversions, but also higher system resource usage. Conversely, decreasing the number of Converter Threads will result in lower system resource usage, but also slower conversions. The optimal setting will depend on the overall performance of your system, especially the hard disk drive and memory performance. In most cases, leaving this setting to its default value will be sufficient.

Chunk Size

A chunk is a block of memory that UTFCast Pro uses to hold the contents of a file during processing. A smaller chunk size means more hard disk input/output operations per file, but faster memory operations and lower memory usage. A larger chunk size has the opposite effect. However, processing a small file with a very large chunk wastes time allocating memory that is never used. If you frequently convert large files, consider increasing the chunk size. For example, to convert 1000 files of 10 MB each, you can set the chunk size to its maximum value, which reduces hard disk input/output operations and improves performance.
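
The trade-off can be made concrete with a little arithmetic. The sketch below assumes one disk read per chunk; the chunk sizes of 64 KB and 4 MB are illustrative values only, not the actual limits of the Chunk Size setting, and read_operations is a hypothetical helper.

```python
def read_operations(file_size: int, chunk_size: int) -> int:
    """Number of chunk-sized reads needed to cover file_size bytes (rounded up)."""
    return (file_size + chunk_size - 1) // chunk_size


KB = 1024
MB = 1024 * KB

file_size = 10 * MB
small_chunk_reads = read_operations(file_size, 64 * KB)  # many small reads
large_chunk_reads = read_operations(file_size, 4 * MB)   # few large reads
```

For a 10 MB file, the small chunk needs 160 reads while the large chunk needs 3, at the cost of a buffer that is 64 times larger.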

Sample Size

The Sample Size setting determines how much of a file is sampled to detect its code page. A larger sample size increases the accuracy of the code page detection, but also uses more memory and makes detection take longer. Note that this setting has no impact on files that are smaller than the specified sample size.

Use codepage GB18030 instead of GB2312

See the section: GB18030 Support

Binary List Acceleration

The binary list is a list of file extensions that are known to be binary files. Examples include .exe files, .rar files, .zip files, .pdf files, etc. Detecting these files can take a significant amount of time and the detection results may not be as precise. By enabling this setting, the detection process for these files can be skipped, thus increasing the detection speed. However, if you have text files that are being ignored by the detection engine, you should turn off this option. Otherwise, keeping this option enabled can help improve the detection speed.

Use Consolidated Buffer

It is recommended to keep this option enabled, unless you encounter issues with memory usage during conversions. The optimal buffer size depends on the memory performance of your system and the number of files you are converting at the same time. It's best to leave the buffer size at the default setting for most cases, but if you encounter memory issues, you can try adjusting the buffer size.

Codepage Reference

GB18030 Support

GB18030 is a separate standard, used in the People's Republic of China for encoding Chinese characters, that supersedes GB2312. In GB18030, a character can occupy 1, 2, or 4 bytes. GB18030 support is turned on by default.
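
The three byte lengths can be observed with Python's built-in gb18030 codec. This sketch is independent of UTFCast Pro; the sample characters and the helper name gb18030_length are chosen purely for illustration.

```python
def gb18030_length(character: str) -> int:
    """Return how many bytes a single character occupies in GB18030."""
    return len(character.encode("gb18030"))


one_byte = gb18030_length("A")             # ASCII letter
two_bytes = gb18030_length("中")            # common Chinese character, also in GB2312
four_bytes = gb18030_length("\U00020000")  # CJK Extension B character

# The same Extension B character cannot be represented in GB2312 at all.
try:
    "\U00020000".encode("gb2312")
    fits_in_gb2312 = True
except UnicodeEncodeError:
    fits_in_gb2312 = False
```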

Supported Codepages

UTFCast Pro supports detecting and reading the below input codepages:

  • US-ASCII
  • Big5
  • EUC-JP (EUC 20932 subset)
  • EUC-KR
  • EUC-TW
  • GB18030
  • GB2312
  • HZ-GB-2312
  • IBM855
  • IBM866
  • ISO-2022-CN
  • ISO-2022-JP / JIS
  • ISO-2022-KR
  • ISO-8859-2
  • ISO-8859-5
  • ISO-8859-7
  • ISO-8859-8
  • KOI8-R
  • Shift-JIS
  • UCS-4-2143
  • UCS-4-3412
  • UTF-16 Big Endian
  • UTF-16 Little Endian
  • UTF-32 Big Endian
  • UTF-32 Little Endian
  • UTF-8
  • Windows-1250 / ISO-8859-1 / Latin-2
  • Windows-1251
  • Windows-1252 / ANSI / Latin-1
  • Windows-1253
  • Windows-1255
  • Windows-874 / TIS 620
  • MAC-Cyrillic / x-mac-cyrillic

Codepage Identifiers

A codepage identifier is a number that tells UTFCast Pro which codepage you are referring to. It is usually used with the /cp command line switch. For example:

UTFCastPro.exe /in:"C:\My Files" /out:"D:\My Output" /cp:1252

The below table shows the complete list of supported codepage identifiers.

Codepage Name | Codepage Identifier
BIG5 | 950
EUC-JP (EUC 20932 subset) | 20932
EUC-KR | 51949
EUC-TW | 51950
GB2312 | 936
GB18030 | 54936
HZ-GB-2312 | 52936
IBM855 | 855
IBM866 | 866
ISO-2022-JP / JIS | 50222
ISO-2022-KR | 50225
ISO-2022-CN | 50227
ISO-8859-2 | 28592
ISO-8859-5 | 28595
ISO-8859-7 | 28597
ISO-8859-8 | 28598
KOI8-R | 20866
MAC-Cyrillic / x-mac-cyrillic | 10007
Shift-JIS | 932
UCS-4-3412 | 3412
UCS-4-2143 | 2143
UTF-8 | 65001
UTF-16LE | 1200
UTF-16BE | 1201
UTF-32LE | 12000
UTF-32BE | 12001
Windows-874 / TIS 620 | 874
Windows-1250 / ISO-8859-1 / Latin-2 | 1250
Windows-1251 | 1251
Windows-1252 / ANSI / Latin-1 | 1252
Windows-1253 | 1253
Windows-1255 | 1255
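
When command lines are generated by a script, a small lookup keeps the identifiers readable. The sketch below copies a few identifiers from the table above; the dictionary keys are shortened names chosen for illustration, and cp_switch is a hypothetical helper, not part of UTFCast Pro.

```python
# A subset of the identifiers listed in the table above.
CODEPAGE_IDS = {
    "BIG5": 950,
    "GB2312": 936,
    "GB18030": 54936,
    "KOI8-R": 20866,
    "SHIFT-JIS": 932,
    "WINDOWS-1252": 1252,
}


def cp_switch(codepage_name: str) -> str:
    """Return the /cp switch for a codepage name, for example '/cp:1252'."""
    identifier = CODEPAGE_IDS[codepage_name.upper()]
    return f"/cp:{identifier}"


switch = cp_switch("Windows-1252")
```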

Supported output Return-Types (Also known as CR/LF Style)

  • No change
  • Force CRLF (Windows Style)
  • Force CR Only (Macintosh Style)
  • Force LF Only (Unix/Linux Style)
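
What forcing a return type means can be sketched as follows. This is an illustration of the idea rather than UTFCast Pro's implementation; the function name force_return_type is hypothetical, and the mode names are borrowed from the /rt command line switch.

```python
import re

# Match CRLF first so that a Windows line ending is treated as one line break.
_ANY_LINE_BREAK = re.compile(r"\r\n|\r|\n")

_TARGETS = {"CRLF": "\r\n", "CR": "\r", "LF": "\n"}


def force_return_type(text: str, return_type: str) -> str:
    """Rewrite every line break in text to the requested return type."""
    key = return_type.upper()
    if key == "NOCHANGE":
        return text
    replacement = _TARGETS[key]
    # A function is used as the replacement so no escape processing is applied.
    return _ANY_LINE_BREAK.sub(lambda _match: replacement, text)


mixed = "one\r\ntwo\rthree\nfour"
unix_style = force_return_type(mixed, "LF")
windows_style = force_return_type(mixed, "CRLF")
```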

Command Line Reference

Command Line Modes

UTFCast Pro supports two command line modes: the Windows Application Command Line Mode and the Console Command Line Mode.

The Windows Application Command Line Mode of UTFCast Pro supports two modes: GUI mode and Quiet mode. In GUI mode, the current status of a session is displayed and the user can interact with the interface to pause or stop the session. In Quiet mode, no interface is displayed and the program can work with a System Service in the background.

The Console Command Line Mode provides a text-only interface that runs in a console window, such as a Windows Command Prompt window or a PowerShell window.

Most features provided in both modes are the same, but they run in different contexts and may provide different conveniences in different situations.

Console Command Line Mode

UTFCast Pro's Console Command Line Mode is a separate application that can be found in UTFCast Pro's installation folder as UTFCastCon.exe.

To use this command line mode, open a Windows Command Prompt window or a PowerShell window and run UTFCastCon.exe inside that window. UTFCast Pro's installer automatically adds its installation folder to your system's PATH environment variable, so you do not need to type the full path to the installation folder when running the console command line mode. However, if the installation folder is not included in the PATH environment variable for any reason, it is recommended that you add it manually for convenience.

Command Line Syntax

The command line syntax for UTFCast Pro is:

UTFCastPro.exe /switch:argument /switch /switch:"argument contains space characters"

The command line syntax for UTFCast Console is:

UTFCastCon.exe /switch:argument /switch /switch:"argument contains space characters"

Switches And Arguments

The table below lists all switches and their available arguments. Both switches and arguments are case-insensitive. If an argument contains at least one space character, it must be enclosed in a pair of double quotes.

Switch | Argument | Description | Comment
/in | "A path to a folder or a file" | Specify which folder or file to input |
/out | "A path to a folder or a file" | Specify which folder or file to output | In DIR mode, if this switch is not present, a sibling folder name is generated, for example a folder named Source_Folder (Converted). In FILE mode, this switch must be present. If any part of the output path does not exist, a corresponding folder is created. If the /d switch is present, this switch is ignored.
/r | | Recursive conversion |
/c | | Copy unconverted files |
/h | | Process hidden files | If this switch is not specified, hidden files and hidden folders in the source folder are ignored.
/quiet | | Quiet mode | Suppress all messages and user interactions. To record any error, detection result, or conversion result in quiet mode, use in combination with the /logfile and /export switches.
/mode | DIR | The source is a folder | The output must be a folder. If this switch is not present, DIR mode is assumed.
/mode | FILE | The source is a file | The output must be a file.
/mode | BACHUITE | The source is a Bachuite file |
/enc | UTF8 | Convert files to UTF-8 | UTF-8 is assumed if this switch is not present.
/enc | UTF16 | Equivalent to UTF16LE |
/enc | UTF16LE | Convert files to UTF-16 Little Endian |
/enc | UTF16BE | Convert files to UTF-16 Big Endian |
/enc | UTF32 | Equivalent to UTF32LE |
/enc | UTF32LE | Convert files to UTF-32 Little Endian |
/enc | UTF32BE | Convert files to UTF-32 Big Endian |
/enc | 2143 | Convert files to UCS-4-2143 |
/enc | 3412 | Convert files to UCS-4-3412 |
/bom | YES | Write a BOM to a converted file | A BOM is written if this switch is not present.
/bom | NO | Do not write a BOM to a converted file |
/rt | CR | Set return type to CR (Macintosh) | The return type is not changed if this switch is not present.
/rt | LF | Set return type to LF (Unix) |
/rt | CRLF | Set return type to CRLF (Windows) |
/rt | NOCHANGE | Do not change return type |
/wf | "A wildcard string" | Apply wildcard filter | If both /wf and /rf are present, /wf is used, unless its value is set to empty.
/rf | "A regular expression" | Apply regular expression filter |
/cp | A codepage identifier | Skip auto-detection and manually specify the codepage decoder | If the source file is a Unicode text file with a BOM, the codepage identifier is ignored. Refer to the Codepage Identifiers section for the full list of available identifiers.
/logfile | "A path to a log file" | Write debug messages to the specified file | If the log file is in a system folder or a folder that needs additional privileges to access, make sure UTFCast Pro is running with the required privileges, or run UTFCast Pro as administrator; otherwise logging will fail.
/d | | Detection only | Detect the file (in FILE mode) or directory (in DIR mode) provided with the /in switch. If this switch is present, the /out switch is ignored. To record the detection result, use in combination with the /export switch. The Console Command Line Mode also outputs the detection result to the console window.
/export | "Path to a CSV file" | Export the detection or conversion result to a CSV file | By default, the exported result file is in UTF-8. To specify a different encoding, use with the /exportenc switch.
/exportenc | UTF8 | Encode the exported result file in UTF-8 | UTF-8 is assumed if this switch is not present.
/exportenc | UTF16 | Equivalent to UTF16LE |
/exportenc | UTF16LE | Encode the exported result file in UTF-16 Little Endian |
/exportenc | UTF16BE | Encode the exported result file in UTF-16 Big Endian |
/exportenc | UTF32 | Equivalent to UTF32LE |
/exportenc | UTF32LE | Encode the exported result file in UTF-32 Little Endian |
/exportenc | UTF32BE | Encode the exported result file in UTF-32 Big Endian |
/exportbom | YES | Add a BOM to the exported result file | If this switch is not present, a BOM is added.
/exportbom | NO | Do not add a BOM to the exported result file |
/resetlayout | | Reset the GUI layout data to its initial state | If you cannot reach some GUI elements after changing your monitor resolution, or have problems after installing a different version of UTFCast Pro, try resetting the layout.
/cmdfile | "A path to a command line file" | Read the command line from a text file instead | Windows has a 260-character path length limit. To pass a very long command line to UTFCast Pro, you can use a Command Line File. See the details in Using Command Line File.
/ver | | Show version info | If this switch is present, other switches are ignored.
/register | | Prompt to enter license information | Use this switch to change license information without showing the main GUI.
/clearlicense | | Delete license information from the system | If you need to move your license to a new PC, you should clear your license information from the old one.
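
When a script generates command lines, the quoting rule above (double quotes around any argument that contains a space) is easy to get wrong. The following sketch assembles a command line from switch and argument pairs; format_switch and build_command are hypothetical helper names, and only switches documented in the table are used.

```python
from typing import Iterable, Optional, Tuple


def format_switch(name: str, argument: Optional[str] = None) -> str:
    """Format one switch as /name or /name:argument, quoting arguments with spaces."""
    if argument is None:
        return f"/{name}"
    if " " in argument:
        argument = f'"{argument}"'
    return f"/{name}:{argument}"


def build_command(executable: str, switches: Iterable[Tuple[str, Optional[str]]]) -> str:
    """Join the executable name and all formatted switches into one command line."""
    parts = [executable]
    parts.extend(format_switch(name, argument) for name, argument in switches)
    return " ".join(parts)


command = build_command(
    "UTFCastPro.exe",
    [("in", r"C:\My Files"), ("out", r"D:\My Output"), ("r", None),
     ("enc", "utf16be"), ("bom", "no")],
)
```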

Using Command Line File

Windows has a limit of 260 characters for the length of file paths, which means that you may not be able to access a file, directory, or run a command line if its total length exceeds 260 characters. UTFCast Pro provides various command line switches and arguments to control the command line mode, some of which may require a path to a file or a directory. If you combine multiple switches in the command line mode, it is possible that your command line will exceed the 260-character limit. Additionally, Windows does not allow you to pass a very long path as an argument to UTFCast Pro in command line mode.

To address this issue, UTFCast Pro introduced a Command Line File feature in version 2.8. This file is a simple text file containing the command line switches and arguments. As you can store very long text in a text file, UTFCast Pro can read the command line from the text file up to a maximum length of 32768 characters.

Using the Command Line File is easy. Simply store all of your command line switches and arguments in the first line of the text file and pass the /cmdfile switch with an argument pointing to the command line file. UTFCast Pro will take care of the rest. For example:

UTFCastPro.exe /cmdfile:"C:\My UTFCast Command Line.txt"

C:\My UTFCast Command Line.txt can then contain a full command line on its first line, as in the example below (note that the keyword UTFCastPro.exe or UTFCastCon.exe must not appear in the command line file):

/in:"C:\A Very Very Very Long Long Long Path That Causes The Command Line to Exceed 260 characters\Input.txt" /out:"C:\Another Very Very Very Long Long Long Path That Causes The Command Line to Exceed 260 characters\Output.txt" /mode:file /bom:yes /enc:utf8 /export:"D:\My UTFCast Logs\Today.log"
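
A script can write the command line file and check the documented constraints before launching the converter. The sketch below makes two assumptions that this manual does not state: the file is written as UTF-8, and write_command_file is a hypothetical helper. It only prepares the file and the launch command and does not run UTFCast Pro.

```python
import tempfile
from pathlib import Path

MAX_COMMAND_LINE = 32768  # maximum length stated in this section


def write_command_file(path: Path, switches: str) -> str:
    """Write switches to the first line of path and return the launch command."""
    if "\n" in switches or "\r" in switches:
        raise ValueError("all switches must be on the first line")
    if len(switches) > MAX_COMMAND_LINE:
        raise ValueError("command line is longer than 32768 characters")
    first_word = switches.split(None, 1)[0].lower() if switches.strip() else ""
    if first_word in ("utfcastpro.exe", "utfcastcon.exe"):
        raise ValueError("the executable name must not be in the command line file")
    # Assumption: UTF-8 is an acceptable encoding for the command line file.
    path.write_text(switches + "\n", encoding="utf-8")
    return f'UTFCastPro.exe /cmdfile:"{path}"'


switches = r'/in:"C:\Input.txt" /out:"C:\Output.txt" /mode:file /bom:yes /enc:utf8'
with tempfile.TemporaryDirectory() as folder:
    command_file = Path(folder) / "command line.txt"
    launch_command = write_command_file(command_file, switches)
    stored = command_file.read_text(encoding="utf-8")
```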

Command Line Examples

Here are some examples of how to use UTFCast Pro in command line mode:

Example 1: To convert every text file in C:\MyFolder (including files in subfolders but excluding hidden files and hidden folders) and save the converted files to D:\MyOutput as UTF-16BE without a BOM, the command line is:

UTFCastPro.exe /in:"C:\MyFolder" /out:"D:\MyOutput" /r /enc:utf16be /bom:no

Example 2: To convert the file C:\MyFile.txt to D:\MyConvertedFile.txt as UTF-8 with a BOM, skipping auto-detection and manually specifying the Windows-1252 decoder to read the source file, the command line is:

UTFCastPro.exe /in:"C:\MyFile.txt" /out:"D:\MyConvertedFile.txt" /enc:utf8 /bom:yes /mode:file /cp:1252

Logging Example: To convert every text file in C:\MyFiles (including files in subfolders but excluding hidden files and hidden folders) and save the converted files to D:\MyConvertedFiles as UTF-8 with a BOM, skipping auto-detection and manually specifying the Windows-1252 decoder to read the source files, while also logging the conversion process to a log file located at D:\UTFCastPro.log, the command line is:

UTFCastPro.exe /in:"C:\MyFiles" /out:"D:\MyConvertedFiles" /r /enc:utf8 /bom:yes /cp:1252 /logfile:"D:\UTFCastPro.log"

Bachuite Reference

Merging multiple tasks

Merging multiple tasks simplifies repetitive work and reduces the number of commands required to accomplish it. Suppose you need to convert several files and folders to a different encoding and save them in a new location. Without merging, you have to run the command line once for each task, for example:

UTFCastPro.exe /in:"D:\My Files" /out:"D:\My Output" /enc:utf8 /rt:crlf /bom:YES

However, by using Bachuite and wrapping the command lines in Bachuite XML, the tasks can be merged and executed with a single command. Each XML element corresponds to one task, and the attributes of the element correspond to the parameters of that task. The command line above becomes:

<dir in="D:\My Files" out="D:\My Output" enc="utf8" rt="crlf" bom="yes"/>

If you want to convert multiple sibling folders and some single files with the command line, you'll need to run the command line multiple times, one time for each task:

          
UTFCastPro.exe /in:"D:\My Files A" /out:"D:\My Output A" /enc:UTF8 /bom:YES /rt:CRLF
UTFCastPro.exe /in:"D:\My Files B" /out:"D:\My Output B" /enc:UTF8 /bom:NO /rt:CRLF
UTFCastPro.exe /in:"D:\Single File A.txt" /out:"D:\Single File Output A.txt" /enc:UTF16LE /mode:FILE /rt:CRLF /bom:YES
UTFCastPro.exe /in:"D:\Single File B.txt" /out:"D:\Single File Output B.txt" /enc:UTF16LE /mode:FILE /rt:CRLF /bom:NO
UTFCastPro.exe /in:"D:\Single File C.txt" /out:"D:\Single File Output C.txt" /enc:UTF16LE /mode:FILE /rt:CRLF /bom:YES
          
        

Wrapped in Bachuite XML, the same tasks can be run with a single command:

          
<dir in="D:\My Files A" out="D:\My Output A" enc="utf8" bom="yes" rt="crlf"/>
<dir in="D:\My Files B" out="D:\My Output B" enc="utf8" bom="no" rt="crlf"/>
<file in="D:\Single File A.txt" out="D:\Single File Output A.txt" enc="utf16le" rt="crlf" bom="yes" />
<file in="D:\Single File B.txt" out="D:\Single File Output B.txt" enc="utf16le" rt="crlf" bom="no" />
<file in="D:\Single File C.txt" out="D:\Single File Output C.txt" enc="utf16le" rt="crlf" bom="yes" />
          
        

In fact, simple wrapping is just one of the options. Bachuite can do multiple tasks with ease by using Sets.

Using Sets

Sets allow you to define a group of attributes that can be inherited by child elements within the set. This makes it easy to apply a set of common attributes to multiple tasks without having to repeat the same attributes for each task.

          
<set rt="crlf" bom="yes" enc="utf8">
 
  <!-- The below tasks inherit rt, bom and enc from the parent set -->
  <dir in="D:\My Files A" out="D:\My Output A" />
  <dir in="D:\My Files B" out="D:\My Output B" bom="no" />
 
  <!-- A child set inherits properties too -->
  <set enc="utf16le">
    <file in="D:\Single File A.txt" out="D:\Single File Output A.txt" />
    <file in="D:\Single File B.txt" out="D:\Single File Output B.txt" bom="no"/>
    <file in="D:\Single File C.txt" out="D:\Single File Output C.txt" />
  </set>
 
</set>
          
        

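In the above example, the parent set defines the rt, bom, and enc attributes, which are then inherited by the child elements within the set. The first dir element inherits all three attributes from the parent set, while the second dir element inherits rt and enc but overrides bom with its own value. The child set inherits rt and bom and overrides enc with "utf16le". The file elements inside it inherit from the child set, and the second file element additionally overrides bom.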
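
The inheritance rule can be expressed as a short recursive walk over the XML tree. The sketch below is an illustration of the rule as described in this section, not UTFCast Pro's actual processor; resolve_tasks is a hypothetical name, and link and profile elements are deliberately ignored.

```python
import xml.etree.ElementTree as ET

BACHUITE = r'''<bachuite version="1.0">
<set rt="crlf" bom="yes" enc="utf8">
  <dir in="D:\My Files A" out="D:\My Output A" />
  <dir in="D:\My Files B" out="D:\My Output B" bom="no" />
  <set enc="utf16le">
    <file in="D:\Single File A.txt" out="D:\Single File Output A.txt" />
    <file in="D:\Single File B.txt" out="D:\Single File Output B.txt" bom="no" />
  </set>
</set>
</bachuite>'''


def resolve_tasks(element, inherited=None):
    """Yield (tag, effective attributes) for every dir or file task under element."""
    inherited = dict(inherited or {})
    for child in element:
        # Explicit attributes of the child override the inherited ones.
        effective = {**inherited, **child.attrib}
        if child.tag == "set":
            yield from resolve_tasks(child, effective)
        elif child.tag in ("dir", "file"):
            yield child.tag, effective


tasks = list(resolve_tasks(ET.fromstring(BACHUITE)))
```

Each entry in tasks carries the complete attribute set that would apply to that task after inheritance and overrides.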

This approach can simplify your XML code and make it easier to manage and maintain, especially when dealing with large sets of similar tasks.

When a link element is used, the Bachuite processor will load the referenced file and include its contents in the current Bachuite document. This allows you to reuse common elements across multiple Bachuite files and keep your code more organized. Here's an example:

          
<link src="D:\MyBachuite1.xml" />
<link src="MyBachuite2.xml" enc="utf32be" bom="no" />
          
        

The src attribute specifies the path to the file to be linked, which can be either an absolute path or a relative path to the current Bachuite file. Any attributes specified on the link element will override those of the linked file.

Be careful when using links: the linked file must exist and be valid, otherwise the whole task will not run.

Using Profiles

A profile is a named set of predefined attributes that can be reused by other elements. A profile element is similar to a set element, but it has a name and cannot have children. A profile is visible to its sibling elements and their children, and only elements in that scope can access it. An element that is assigned an existing profile does not inherit any attribute from its own parent. Instead, it clones all attributes of the profile, including those the profile inherited from its parent. Bachuite applies all attributes of the profile to the element first and then applies the element's explicitly specified attributes.

A profile can be assigned to another, so you can have a profile that includes another profile as well as specific attribute overrides.

Profiles can also be overridden by explicitly presented attributes. For example, if you have a profile with bom="yes" but then explicitly set bom="no" on an element, that element will have bom="no".

It's important to note that profiles have a scope. Any element outside the scope of a profile cannot access it. In the example below, the dir element in the second set (marked with the ERROR comment) results in an error because it is outside the scope of the "with_bom" profile. Overall, profiles can be a powerful tool for organizing and reusing attributes in Bachuite.

Note that Bachuite profile elements are similar to the setting profile feature in the GUI, but the two are designed for and work in different environments. Saved GUI setting profiles cannot be loaded in Bachuite, and Bachuite profiles cannot be loaded in the GUI.

Here's an example of using profiles:

          
<set enc="utf8" in="D:\Text Files">
    <!-- A profile also inherits attributes from its parent, just like other elements. -->
    <!-- The profiles below also have the enc attribute set to "utf8" and the in attribute set to "D:\Text Files", even though these attributes are not explicitly present. -->
    <profile name="with_bom" bom="yes" rt="crlf" />
    <profile name="without_bom" bom="no" rt="crlf" />
 
    <!-- All elements below here and their children can use the two profiles defined above -->
    <dir out="D:\Output with bom" profile="with_bom" />
    <dir out="D:\Output without bom" profile="without_bom" />
 
    <!-- A profile can also be assigned to a set, a link, or even another profile -->
    <!-- Explicitly setting an attribute value (enc="utf16" in this example) overrides it -->
    <profile name="different_enc" enc="utf16" out="D:\Profile Out" profile="without_bom" />
 
    <set out="D:\New Output">
        <!-- Because a profile is assigned, this element does not inherit any attribute from its parent. The out attribute value "D:\Profile Out", which is copied from the profile, is used. -->
        <dir profile="different_enc" />
    </set>
</set>
<set enc="utf16le">
    <!-- ERROR, the below element is out of the "with_bom" profile's scope -->
    <dir in="D:\Text Files" out="D:\Output with bom" profile="with_bom" />
</set>
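
The profile rules can be sketched as a small resolver. This is an interpretation of the rules described in this section, not UTFCast Pro's implementation: the function name resolve is hypothetical, profiles are assumed to become visible in document order, link elements are ignored, and a shortened version of the example above is used as input.

```python
import xml.etree.ElementTree as ET

DOC = r'''<bachuite version="1.0">
<set enc="utf8" in="D:\Text Files">
  <profile name="without_bom" bom="no" rt="crlf" />
  <profile name="different_enc" enc="utf16" out="D:\Profile Out" profile="without_bom" />
  <set out="D:\New Output">
    <dir profile="different_enc" />
  </set>
</set>
<set enc="utf16le">
  <dir in="D:\Text Files" out="D:\Out" profile="without_bom" />
</set>
</bachuite>'''


def resolve(element, inherited, profiles):
    """Yield (tag, effective attributes) for each dir or file task under element."""
    profiles = dict(profiles)  # profiles defined here stay local to this scope
    for child in element:
        explicit = {k: v for k, v in child.attrib.items() if k not in ("profile", "name")}
        profile_name = child.get("profile")
        if profile_name is None:
            base = inherited
        elif profile_name in profiles:
            base = profiles[profile_name]  # the parent's attributes are not inherited
        else:
            raise LookupError(f"profile {profile_name!r} is out of scope")
        effective = {**base, **explicit}  # explicit attributes are applied last
        if child.tag == "profile":
            profiles[child.get("name")] = effective
        elif child.tag == "set":
            yield from resolve(child, effective, profiles)
        elif child.tag in ("dir", "file"):
            yield child.tag, effective


tasks = resolve(ET.fromstring(DOC), {}, {})
first_task = next(tasks)
try:
    next(tasks)
    scope_error = None
except LookupError as error:
    scope_error = str(error)
```

The first task takes its out value from the profile rather than from its parent set, and the second lookup fails because the profile was defined inside a different set.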
          
        

Resolving absolute and relative paths

Any path in an element can be an absolute path like C:\MyFile.txt, or a relative path like MyFile.txt or ..\MyFile.txt. A relative path is resolved relative to the location of the current Bachuite file. For example:

In C:\First.xml:

          
<link src="D:\Second.xml" />
<link src="Third.xml" />
<file in="MyFile.txt" out="SubDir\MyFileOutput.txt" />
          
        

When linking to Third.xml in C:\First.xml, the path of Third.xml is resolved to C:\Third.xml.

The same applies to paths in other elements. In C:\First.xml, MyFile.txt and SubDir\MyFileOutput.txt are resolved to C:\MyFile.txt and C:\SubDir\MyFileOutput.txt, respectively.
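
The resolution rule matches what Python's ntpath module does when joining Windows paths, so it can be reproduced on any operating system. The helper name resolve_bachuite_path is hypothetical, and the sketch only illustrates the rule described above.

```python
import ntpath


def resolve_bachuite_path(bachuite_file: str, path: str) -> str:
    """Resolve path relative to the folder that contains bachuite_file."""
    base_folder = ntpath.dirname(bachuite_file)
    # ntpath.join keeps an absolute path unchanged and appends a relative one.
    return ntpath.normpath(ntpath.join(base_folder, path))


linked_absolute = resolve_bachuite_path(r"C:\First.xml", r"D:\Second.xml")
linked_relative = resolve_bachuite_path(r"C:\First.xml", "Third.xml")
output_file = resolve_bachuite_path(r"C:\First.xml", r"SubDir\MyFileOutput.txt")
```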

Running Bachuite

The Bachuite XML must be saved as an XML file. Its content is nothing more than a normal XML file with the Bachuite root element and the Bachuite XML schema. For example, save the below XML to D:\MyBachuite.xml:

          
<?xml version="1.0" encoding="UTF-8"?>
<bachuite version="1.0">
 
  <!-- Your Bachuite XML goes here -->
 
</bachuite>
          
        

Run the Bachuite file using the below command line:

UTFCastPro.exe /in:"D:\MyBachuite.xml" /mode:bachuite
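
A Bachuite file can also be generated by a script. The sketch below builds the document with Python's xml.etree.ElementTree, following the root element shown above; build_bachuite is a hypothetical helper, and no schema reference is added because the example above does not show one. It requires Python 3.8 or later for the xml_declaration parameter.

```python
import xml.etree.ElementTree as ET


def build_bachuite(tasks):
    """Return a Bachuite document as UTF-8 bytes for (tag, attributes) pairs."""
    root = ET.Element("bachuite", {"version": "1.0"})
    for tag, attributes in tasks:
        ET.SubElement(root, tag, attributes)
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


document = build_bachuite([
    ("dir", {"in": r"D:\My Files", "out": r"D:\My Output", "enc": "utf8"}),
    ("file", {"in": r"D:\Single File A.txt", "out": r"D:\Single File Output A.txt"}),
])
```

The resulting bytes can be written to a file such as D:\MyBachuite.xml and run with the command line shown above.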

Bachuite Attributes

All available attributes are listed in the table below. Element and attribute names are case-sensitive; attribute values, however, are case-insensitive.

Supported Elements | Attribute | Value | Description | Comment
set, file, dir, link, profile | in | A path to a file or a folder | Specify which folder or file to input |
set, file, dir, link, profile | out | A path to a file or a folder | Specify which folder or file to output | In a dir element, if this attribute is not present, or its value is empty, a sibling folder name is generated, for example a folder named Source_Folder (Converted). In a file element, this attribute must have a value. If any part of the output path does not exist, a corresponding folder is created.
set, file, dir, link, profile | r | YES | Recursive conversion | NO is assumed if the attribute is not present.
set, file, dir, link, profile | r | NO | Non-recursive conversion |
set, file, dir, link, profile | c | YES | Copy unconverted files | NO is assumed if the attribute is not present.
set, file, dir, link, profile | c | NO | Ignore unconverted files |
set, file, dir, link, profile | h | YES | Process hidden files | NO is assumed if the attribute is not present.
set, file, dir, link, profile | h | NO | Do not process hidden files |
set, file, dir, link, profile | enc | UTF8 | Convert to UTF-8 | UTF8 is assumed if the attribute is not present.
set, file, dir, link, profile | enc | UTF16LE | Convert to UTF-16 Little Endian |
set, file, dir, link, profile | enc | UTF16BE | Convert to UTF-16 Big Endian |
set, file, dir, link, profile | enc | UTF32LE | Convert to UTF-32 Little Endian |
set, file, dir, link, profile | enc | UTF32BE | Convert to UTF-32 Big Endian |
set, file, dir, link, profile | enc | 2143 | Convert to UCS-4-2143 |
set, file, dir, link, profile | enc | 3412 | Convert to UCS-4-3412 |
set, file, dir, link, profile | bom | YES | Write a BOM to a converted file | YES is assumed if the attribute is not present.
set, file, dir, link, profile | bom | NO | Do not write a BOM to a converted file |
set, file, dir, link, profile | rt | CR | Set return type to CR (Macintosh) | NOCHANGE is assumed if the attribute is not present.
set, file, dir, link, profile | rt | LF | Set return type to LF (Unix) |
set, file, dir, link, profile | rt | CRLF | Set return type to CRLF (Windows) |
set, file, dir, link, profile | rt | NOCHANGE | Do not change return type |
set, file, dir, link, profile | cp | A codepage identifier | Skip auto-detection and manually specify the codepage decoder | If the source file is a Unicode text file with a BOM, the codepage identifier is ignored. Refer to the Codepage Identifiers section for the full list of available identifiers.
set, file, dir, link, profile | wf | A wildcard string | Apply wildcard filter | If both wf and rf are present, wf is used unless its value is set to empty. To disable filters, either set both values to empty, or do not provide any of them in the element or any parent elements.
set, file, dir, link, profile | rf | A regular expression | Apply regular expression filter |
set, file, dir, link, profile | profile | A profile name | Assign an attribute profile to the element | Assigning an attribute profile to an element clones all attribute values (both implicit and explicit) from the profile. Only profiles that are defined in the same or a parent scope can be accessed by the current element.
link | src | A path to a Bachuite file | Link to an external Bachuite file | Linking to a non-existent Bachuite file causes the whole Bachuite to refuse to run.
profile | name | A unique element name | Define a profile | The profile can be accessed by all sibling elements and their children.