FTP binary and ASCII transfer types and the case of corrupt files

Choosing the right FTP transfer mode, binary or ASCII, keeps files intact during transfers. The wrong mode can corrupt a file so that it no longer works as expected. Binary mode suits non-text files (e.g., images), while ASCII mode suits text files. Learn how setting a default mode reduces errors and why some text files, like those using UTF-8 character encoding, may contain characters not supported by ASCII.

When transferring files through the File Transfer Protocol (FTP), you sometimes need to pay attention to the type of file you transfer and the transfer mode used. When these two don’t match, you could end up with a corrupted file that doesn’t function or appear as expected.

For instance, if you transfer a text file from a Windows system to a UNIX system using binary mode, the transferred file could end up with additional characters.

FTP supports two transfer modes: binary mode and American Standard Code for Information Interchange (ASCII) mode. While most modern FTP programs are intelligent enough to determine the correct transfer mode, legacy FTP systems lack that capability. Since a wrong file transfer mode can alter the downloaded file, it’s essential to understand the concepts behind this FTP behavior.

How a wrong FTP transfer mode causes file corruption

As indicated earlier, your file transfer can behave differently when you use the wrong transfer mode. Let’s look at some specific examples to illustrate what we mean.

Binary mode FTP and unexpected line endings


Let’s say you’re transferring a text file from a Windows machine to a UNIX machine, and the original file contains the following lines:

echo "Hello, world!"

echo "This is line 2."

echo "And this is line 3."

If you use binary mode FTP, the file contents can gain additional characters after the transfer. For example, the contents may end up looking like this:

echo "Hello, world!"^M

echo "This is line 2."^M

echo "And this is line 3."^M

 


Notice the additional “^M” characters at the end of each line. This can happen because an FTP transfer set to binary mode transfers a file as is. You might be wondering: if binary mode transfers a file as is, why does the final file contain additional characters?

What you’re seeing are line-ending misrepresentations. When a text file contains multiple lines like the one above, each line terminates with an end-of-line (EOL) marker, also called a line ending. The line ending can vary from one operating system to another. In Microsoft Windows, line endings are denoted by carriage return+line feed (CRLF) pairs. On the other hand, UNIX-based systems such as Linux, macOS, FreeBSD and AIX use only a line feed (LF) as their line ending.

Line endings are present in every multi-line text file. You don’t see them because when a text editor recognizes a line ending, it displays the content accordingly. Specifically, the text editor applies line breaks where the line endings are found. However, if a text editor fails to recognize a particular line ending, it can render that line ending incorrectly.

For instance, it can replace the line ending with visible characters or display everything on a single line. These issues usually occur when you use binary mode FTP to transfer a multi-line text file between systems that use different line endings.

Let’s review the case above.

Since the file originates from a Windows system, it contains CRLF line endings. Because binary mode FTP doesn’t change these line endings, they are retained when the file reaches its destination, which is a UNIX system. Unfortunately, UNIX tools treat only the LF as the line ending, so the leftover carriage return is displayed as “^M”.

These unexpected changes can affect the content’s readability. Worse, if you’re transferring a script file, markup file, source code file or any file that needs to maintain its structure, incorrect line endings can cause errors when parsed or interpreted.
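To check which line endings a file actually contains before or after a transfer, you can inspect its raw bytes. The following is a minimal Python sketch; the function name `detect_line_endings` is made up for illustration and isn’t part of any FTP tool.

```python
def detect_line_endings(data: bytes) -> str:
    """Classify the line endings found in raw file contents."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair also contains an LF, so subtract those to get bare LFs.
    bare_lf = data.count(b"\n") - crlf
    if crlf and bare_lf:
        return "mixed"
    if crlf:
        return "CRLF"
    if bare_lf:
        return "LF"
    return "none"


# A Windows-style script, as it would arrive after a binary mode transfer.
sample = b'echo "Hello, world!"\r\necho "This is line 2."\r\n'
print(detect_line_endings(sample))  # CRLF
```

A result of "CRLF" on a UNIX destination suggests the file was sent in binary mode and still carries its Windows line endings.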

To avoid this issue, transfer multi-line text files using ASCII mode FTP. In ASCII mode, the sending or receiving platform converts the line endings as needed.

For example, if the sending platform is Windows and the receiving platform is Linux, the sender won't make any changes. However, the receiver would remove the CRs. On the other hand, if the sender is on Linux and the receiver is on Windows, the sender would add CRs and the receiver wouldn’t do anything. The following diagram illustrates the process we just described:
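The conversion rules just described can also be sketched in Python. This is a simplified model, not the code of any particular FTP client or server. The function names `to_wire` and `from_wire` are made up for illustration, and each side is assumed to know its own platform’s line ending.

```python
CRLF = b"\r\n"
LF = b"\n"


def to_wire(data: bytes, local_eol: bytes) -> bytes:
    """Sender side: make sure line endings are CRLF on the data connection."""
    if local_eol == CRLF:
        return data  # Windows sender: nothing to change
    return data.replace(local_eol, CRLF)  # UNIX sender: add CRs


def from_wire(data: bytes, local_eol: bytes) -> bytes:
    """Receiver side: convert CRLF to the local line ending."""
    if local_eol == CRLF:
        return data  # Windows receiver: nothing to change
    return data.replace(CRLF, local_eol)  # UNIX receiver: remove CRs


# Windows sender, Linux receiver: the receiver removes the CRs.
windows_file = b'echo "Hello, world!"\r\necho "This is line 2."\r\n'
on_linux = from_wire(to_wire(windows_file, CRLF), LF)

# Linux sender, Windows receiver: the sender adds the CRs.
linux_file = b'echo "Hello, world!"\necho "This is line 2."\n'
on_windows = from_wire(to_wire(linux_file, LF), CRLF)

print(on_linux == linux_file, on_windows == windows_file)
```

In both directions, the file ends up with the line endings its destination expects.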

 

ASCII mode FTP and corrupted files

Problems can also occur when you download an image file, executable file or any binary file in ASCII mode. For example, a user first downloads a text file in ASCII mode. After that, the user downloads an image file in the same login session without shifting to binary mode.

Below, you can see this illustrated through a hypothetical FTP command line output:

$ ftp ftp.example.com

Connected to ftp.example.com.

220 (vsFTPd 3.0.3)

Name (ftp.example.com:user): your_username

331 Please specify the password.

Password: your_password

230 Login successful.

Remote system type is UNIX.

Using binary mode to transfer files.

ftp> ascii

200 Switching to ASCII mode.

ftp> cd text_files

250 Directory successfully changed.

ftp> get example_text_file.txt

local: example_text_file.txt remote: example_text_file.txt

200 PORT command successful. Consider using PASV.

150 Opening ASCII mode data connection for example_text_file.txt (256 bytes).

226 Transfer complete.

256 bytes received in 0.00 secs (1000.00 Kbytes/sec)

ftp> cd image_files

250 Directory successfully changed.

ftp> get firefox.jpg

local: firefox.jpg remote: firefox.jpg

200 PORT command successful. Consider using PASV.

150 Opening ASCII mode data connection for firefox.jpg (1024 bytes).

226 Transfer complete.

1024 bytes received in 0.01 secs (100.00 Kbytes/sec)

ftp> bye

221 Goodbye.

In case the output is hard to follow, here is a summary of what the user does:
  1. Connects to the FTP server using ftp ftp.example.com
  2. Logs in with their username and password
  3. Switches to ASCII mode using the ascii command
  4. Changes to the directory containing the text file using the cd command
  5. Downloads the text file (example_text_file.txt) using the get command
  6. Changes to the directory containing the image file using the cd command
  7. Downloads the image file (firefox.jpg) using the get command without switching to binary mode
  8. Exits the FTP session using the bye command

Because most FTP clients now use binary mode by default, the user switches to ASCII mode to support the first file download, which is a text file. That’s a good call because, as discussed earlier, a binary mode FTP download leaves the source system’s line endings unchanged, which can cause problems when the destination uses a different convention.

Unfortunately, the user forgets to shift back to binary mode before downloading the next file, which happens to be an image file. An image file is considered a binary file and must be transferred in binary mode. While the download proceeds without any hitches, the file will likely have been corrupted.
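To see why ASCII mode can damage a binary file, consider what line-ending conversion does to bytes that were never meant to be text. The sketch below is a simplified model rather than the behavior of a specific server. It assumes a UNIX sender that inserts a CR before every LF byte and a Windows receiver that leaves the data untouched, as described earlier. It uses the 8-byte PNG signature as a stand-in payload because that signature contains both a CRLF pair and a bare LF.

```python
# The 8-byte signature that starts every PNG file.
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])


def ascii_mode_sender(data: bytes) -> bytes:
    """Model of a UNIX sender in ASCII mode: every LF byte becomes CRLF."""
    return data.replace(b"\n", b"\r\n")


received = ascii_mode_sender(PNG_SIGNATURE)

print(PNG_SIGNATURE.hex(" "))  # 89 50 4e 47 0d 0a 1a 0a
print(received.hex(" "))       # 89 50 4e 47 0d 0d 0a 1a 0d 0a
print(len(PNG_SIGNATURE), len(received))  # 8 10
```

The received data is two bytes longer and no longer starts with a valid PNG signature. The same kind of change can happen anywhere in an image where the byte 0x0A appears.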


For example, if the user attempts to open the file in a Linux image viewer, the viewer may fail to load the image or display it incorrectly.

So, how can you tell when you should use binary or ASCII mode?

When to use FTP binary mode

As the name suggests, binary mode FTP is suitable for binary files. These are files that contain binary data or non-text data. If you attempt to open a binary file using a text editor, you won’t be able to decipher the contents, which will appear as gibberish. Here are some of the contents of a .png image file, for example:

gibberish

The following are considered binary files and are therefore suitable for binary transfers:

  • Image files (e.g., .jpg, .bmp, .png)
  • Sound files (e.g., .mp3, .wav, .wma)
  • Video files (e.g., .avi, .flv, .mkv, .mov, .mp4)
  • Archive files (e.g., .zip, .rar, .tar)
  • Executables (e.g., .exe, .dll, .bin, .jar)
  • Other files (e.g., .doc, .xls, .pdf)

When to use FTP ASCII mode

As you might have gathered, ASCII mode FTP is suitable for transferring text files. These files contain human-readable text and can be viewed using a text editor like Notepad, TextEdit, Nano, or Pico. Unless they contain non-ASCII characters, files with the following extensions can be transferred using ASCII mode FTP without issues: .html, .php, .cgi, .js, .txt and .css.

Some text files, like those using UTF-8 character encoding, may contain characters not supported by ASCII. For example, ASCII doesn’t support Japanese, Chinese or Korean characters. Text files that contain these characters are exceptions and should be transferred using binary mode. So, if you want to play it safe, use ASCII mode FTP only for text files that contain nothing but ASCII characters, and use binary mode for everything else.
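The rule of thumb above can be sketched as a small Python helper built on the standard `ftplib` module. The names `TEXT_EXTENSIONS`, `choose_mode` and `upload` are made up for illustration, and the extension list simply mirrors the one given earlier. In `ftplib`, `storlines` sends a file in ASCII mode and `storbinary` sends it in binary mode.

```python
from ftplib import FTP
from pathlib import Path

# Extensions treated as text, taken from the list in this article.
TEXT_EXTENSIONS = {".html", ".php", ".cgi", ".js", ".txt", ".css"}


def choose_mode(path) -> str:
    """Return "ascii" only for text files that contain pure ASCII bytes."""
    path = Path(path)
    if path.suffix.lower() in TEXT_EXTENSIONS and path.read_bytes().isascii():
        return "ascii"
    return "binary"


def upload(ftp: FTP, path) -> str:
    """Upload one file using the mode picked by choose_mode."""
    path = Path(path)
    mode = choose_mode(path)
    # ftplib expects the file opened in binary mode for both calls.
    with path.open("rb") as f:
        if mode == "ascii":
            ftp.storlines(f"STOR {path.name}", f)
        else:
            ftp.storbinary(f"STOR {path.name}", f)
    return mode


# Usage (hypothetical host and credentials):
#   with FTP("ftp.example.com") as ftp:
#       ftp.login("your_username", "your_password")
#       upload(ftp, "index.html")
```

A UTF-8 text file with Japanese characters would fail the ASCII check here and fall back to binary mode, which matches the advice above.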

How setting a default transfer mode helps minimize file corruption

When you regularly carry out FTP file transfers as part of a business process, chances are you already know what types of files are involved. To minimize transfer mode-related errors, you can set a default transfer mode. If most of the files involved are ASCII text files, you can set the default mode to ASCII. If most of the files are binary, you can set the default transfer mode to binary.
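In a script, a default transfer mode can be modeled as a setting that applies unless a single transfer overrides it. The following Python sketch uses the standard `ftplib` module; the class name `ModeAwareDownloader` and its parameters are hypothetical and only illustrate the idea. In `ftplib`, `retrlines` downloads in ASCII mode and `retrbinary` downloads in binary mode.

```python
from ftplib import FTP

VALID_MODES = ("ascii", "binary")


class ModeAwareDownloader:
    """Download files with a default transfer mode and per-file overrides."""

    def __init__(self, ftp: FTP, default_mode: str = "binary"):
        if default_mode not in VALID_MODES:
            raise ValueError(f"unknown mode: {default_mode}")
        self.ftp = ftp
        self.default_mode = default_mode

    def get(self, remote_name: str, local_path: str, mode: str = None) -> str:
        mode = mode or self.default_mode
        if mode not in VALID_MODES:
            raise ValueError(f"unknown mode: {mode}")
        if mode == "ascii":
            # retrlines passes each line without its line ending, so write
            # "\n" and let text mode translate it to the local convention.
            with open(local_path, "w", encoding="utf-8") as f:
                self.ftp.retrlines(f"RETR {remote_name}",
                                   lambda line: f.write(line + "\n"))
        else:
            with open(local_path, "wb") as f:
                self.ftp.retrbinary(f"RETR {remote_name}", f.write)
        return mode


# Usage (hypothetical host and credentials):
#   with FTP("ftp.example.com") as ftp:
#       ftp.login("your_username", "your_password")
#       downloader = ModeAwareDownloader(ftp, default_mode="binary")
#       downloader.get("firefox.jpg", "firefox.jpg")
#       downloader.get("example_text_file.txt", "example_text_file.txt", mode="ascii")
```

Because the mode is chosen for each call instead of being left over from an earlier command, a forgotten switch back to binary is less likely.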

Managed file transfer solutions like JSCAPE MFT by Redwood Software allow you to set the default FTP transfer mode in just a few clicks.

JSCAPE MFT is a secure, platform-agnostic, multi-protocol, automation-ready file transfer solution. It supports FTP as well as several other file transfer protocols like Secure File Transfer Protocol (SFTP), Applicability Statement 2 (AS2), Hypertext Transfer Protocol (HTTP) and Odette File Transfer Protocol (OFTP). JSCAPE MFT runs on all major operating systems, including Windows, Linux, macOS, AIX and Solaris.

Interested in trying it out? Start your free 7-day trial of JSCAPE MFT Server now.
