How To Sync a Linux Directory with an Amazon S3 Bucket

Learn how to synchronize a Linux directory with an Amazon S3 bucket in this blog and video tutorial

In one of our previous posts, we taught you how to sync a Windows folder with an AWS S3 bucket. This time, we'll show you how to do that with a Linux directory.

I'm going to assume you already have a JSCAPE MFT Server agent running on your remote Linux host and that you've set up the agent service and agent trading partner on JSCAPE MFT Server. If you haven't done that yet, just click the links and follow the instructions there.

Alright. I'm now in the directory named 'syncdir' on my remote Linux host. If you recall, this is the directory we picked to synchronize, a.k.a. the 'root path', when we configured our agent in the post Install MFT Server Agents on Linux Hosts. Note that there are no files in this directory yet. We'll populate it later with files from an S3 bucket folder upon synchronization.

Speaking of which, here's our Amazon S3 bucket folder named 'folder1'. It's found inside the bucket named 'jscapejohn'. Notice that there are three files in there. Once we sync this folder with our Linux directory, those files will be copied over there.

Let's now head over to our JSCAPE MFT Server instance. If we go to Overview > Sessions, we can see an agent session for the agent running in our remote Linux host. We can identify that from the IP address of my Linux host, which is found under the Client column.

In the Trading Partners module, I also have two trading partners: one for the agent on the Linux host and another for the Amazon S3 bucket.

If you don't have an Amazon S3 trading partner on hand yet, I suggest you read the post: How To Connect and Upload Files To an Amazon S3 Trading Partner.

Read the section 'Creating the Amazon S3 Trading Partner'.

Let's now create the trigger that will synchronize the specified directories or folders in those two trading partners.

Go to the Triggers module and click the Add button.

For those who are using JSCAPE MFT Server version 12.1 and higher, you'll be presented with a Trigger Template dialog where you can choose a template that best describes the workflow you're about to automate. Let's just leave that blank and click OK.

Give the trigger a name, for example 'sync linux directory with s3 bucket', and then select the Current Time event type. You may select any event type that suits your needs but, for this example, I'd like this trigger to run at a certain time of the day. That's why I'm using the Current Time event type.

Click Next to proceed.

On the next screen, you can use the Expression Builder to build an expression that defines the schedule on which this trigger should fire. I want this trigger to execute at 8:30 AM every day, so I built the expression accordingly.
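
To make the schedule concrete, here is a minimal Python sketch of what "fire at 8:30 AM every day" means in practice. It is not JSCAPE's expression syntax (that is what the Expression Builder produces); the function name `next_daily_run` is hypothetical and only illustrates the timing logic.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now: datetime, at: time = time(8, 30)) -> datetime:
    """Return the next moment a daily schedule at `at` would fire after `now`."""
    candidate = datetime.combine(now.date(), at)
    # If today's slot has already passed (or is exactly now), wait until tomorrow.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    # Hypothetical "current time" used only for demonstration.
    print(next_daily_run(datetime(2024, 1, 15, 8, 3)))
```

Changing the `at` argument, for example to `time(8, 1)`, models moving the trigger to a different time of day.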

Recommended read: Introducing the New Trigger Conditions Expression Builder

Click Next to proceed.

Once you get to the next screen, click the Add button to add a new trigger action. Select Trading Partner Synchronization from the Action drop-down list, and then click OK.

Let me now walk you through the key settings for this Trading Partner Synchronization action.

The first ones you'll encounter are PartnerA and PartnerB. Usually, PartnerA is the source. But because we'll be using the synchronize mode, which is a two-way process, the concept of a source and a target doesn't really apply here.

Anyway, let's just select our Linux host agent trading partner for PartnerA and our Amazon S3 trading partner for PartnerB.

Next up are PathA and PathB. PathA is the relative path on PartnerA that will take part in the synchronization, and PathB is the relative path on PartnerB that PathA will synchronize with. Of course, those paths should already exist in the respective trading partners.

The forward slash (/) in PathA simply points to the root of our designated 'Root path' for the Linux host agent. If you recall, its absolute path on that Linux host is /home/ec2-user/syncdir.

For PathB, 'jscapejohn' is just the S3 bucket and 'folder1' is just the folder inside that bucket.
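
As a rough illustration of how these relative paths map to real locations, the Python sketch below joins PathA to the agent's root path and splits an S3-style path into a bucket and a folder. The helper names and the `/jscapejohn/folder1` path format are assumptions made for this example, so check your own trigger dialog for the exact value of PathB.

```python
import posixpath
from typing import Tuple


def resolve_agent_path(root_path: str, relative_path: str) -> str:
    """Join a path relative to the agent's root path into an absolute path."""
    # Strip leading slashes so "/" and "/reports" are treated as relative to the root.
    joined = posixpath.join(root_path, relative_path.lstrip("/"))
    return posixpath.normpath(joined)


def split_bucket_path(path: str) -> Tuple[str, str]:
    """Split 'bucket/folder/...' into (bucket, folder inside the bucket)."""
    bucket, _, folder = path.strip("/").partition("/")
    return bucket, folder


if __name__ == "__main__":
    print(resolve_agent_path("/home/ec2-user/syncdir", "/"))
    print(split_bucket_path("/jscapejohn/folder1"))  # assumed PathB format
```

The point of the sketch is that "/" in PathA never refers to the root of the Linux filesystem, only to the root path configured for the agent.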

The next setting we need to specify is the Copy Condition. This is the condition JSCAPE MFT Server will use to determine whether to commence copying (or synchronizing) files each time the predefined schedule of this trigger is up.

If you select:

  • different time, JSCAPE MFT Server will commence copying if it sees that file timestamps on A are different from the ones on B;
  • different size, JSCAPE MFT Server will commence copying if it sees that file sizes on A are different from the ones on B;
  • different content, JSCAPE MFT Server will commence copying if it sees that the content in A is different from the content in B.

Let's just choose different time for now.
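
The three copy conditions can be pictured with a small Python sketch. This is only an illustration of the descriptions above, not JSCAPE MFT Server's actual implementation: the `FileInfo` class and `needs_copy` function are hypothetical, and comparing content by SHA-256 digest is my own assumption about how a content comparison could be done.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class FileInfo:
    mtime: float    # last-modified timestamp, seconds since the epoch
    size: int       # size in bytes
    content: bytes  # file content (kept in memory for this small example)


def needs_copy(a: FileInfo, b: FileInfo, condition: str) -> bool:
    """Return True if the chosen copy condition says the file on A differs from B."""
    if condition == "different time":
        return a.mtime != b.mtime
    if condition == "different size":
        return a.size != b.size
    if condition == "different content":
        # Assumption: compare digests rather than raw bytes.
        return hashlib.sha256(a.content).digest() != hashlib.sha256(b.content).digest()
    raise ValueError(f"unknown copy condition: {condition!r}")
```

Note that two files can have the same size but different content, so in this sketch the different size check is the cheapest but also the least strict of the three.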

Another setting you need to specify is the Synchronization Mode. There are four options:

  • mirror - New and modified files from A are copied to B; redundant files in B will be deleted;
  • synchronize - New and modified files from both paths are copied to each other;
  • backup - All files from A are copied to B;
  • contribute - New and modified files from A are copied to B.

As you might have guessed, we'll be using synchronize for this example.
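
To see how the four modes differ, here is a minimal Python sketch that turns two file listings into a list of planned operations. It models only the one-line descriptions above and is not how JSCAPE MFT Server works internally. The function name `plan_sync`, the file names, and the rule that a file counts as "modified" when its timestamp is newer than the other side's copy are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Listing = Dict[str, float]  # file name -> last-modified timestamp
MODES = ("mirror", "synchronize", "backup", "contribute")


def plan_sync(a: Listing, b: Listing, mode: str) -> List[Tuple[str, str]]:
    """Return the sorted (operation, file name) steps a mode would perform."""
    if mode not in MODES:
        raise ValueError(f"unknown synchronization mode: {mode!r}")

    steps: List[Tuple[str, str]] = []
    for name, mtime in a.items():
        # backup copies everything; the other modes copy only new or newer files.
        if mode == "backup" or name not in b or mtime > b[name]:
            steps.append(("copy A->B", name))

    if mode == "synchronize":
        # Two-way: files that are new or newer on B also flow back to A.
        for name, mtime in b.items():
            if name not in a or mtime > a[name]:
                steps.append(("copy B->A", name))
    elif mode == "mirror":
        # Files that exist only on B are treated as redundant and removed.
        for name in b:
            if name not in a:
                steps.append(("delete from B", name))
    return sorted(steps)


if __name__ == "__main__":
    empty_syncdir: Listing = {}
    # Hypothetical names standing in for the three files in the S3 folder.
    s3_folder: Listing = {"file1.txt": 1.0, "file2.txt": 2.0, "file3.txt": 3.0}
    for step in plan_sync(empty_syncdir, s3_folder, "synchronize"):
        print(step)
```

Under these assumptions, an empty A combined with mirror mode would plan to delete every file in B, whereas synchronize copies the files from B over to A. That is the behavior we want in this example.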

Lastly, you need to specify the Result Directory. This is the directory where the results of the synchronization process will be written.

Click OK and then, when you get to the outer screen, drag an arrow from the Start output of the Workflow node to the Trading Partner Synchronization Action node.

Recommended read: Introducing the Redesigned Trigger Action Workflow

Click OK to finalize the trigger creation process.

While preparing this tutorial, I realized 8:30 AM was going to be a long wait. Since this is just an example, I changed the trigger condition to 8:01 AM. By the time I checked again, it was already 8:03 AM, which means the trigger had already fired.

You can verify whether a trigger has actually fired by going to the History tab of the Triggers module.

Here, you can see that our trigger did fire at 8:01 AM.

And if we look in the 'syncdir' directory in our Linux host, we see that the files from our Amazon S3 bucket folder have indeed been copied over.
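
If you'd rather check this from a script than by eye, a small Python sketch like the one below could be run on the Linux host. The `missing_files` helper and the file names are hypothetical. The list of expected names would come from whatever you see in your S3 bucket folder.

```python
from pathlib import Path
from typing import Iterable, Set


def missing_files(directory: str, expected_names: Iterable[str]) -> Set[str]:
    """Return the expected file names that are not present in `directory`."""
    present = {p.name for p in Path(directory).iterdir() if p.is_file()}
    return set(expected_names) - present


if __name__ == "__main__":
    # Hypothetical file names; replace them with the ones in your S3 folder.
    expected = ["file1.txt", "file2.txt", "file3.txt"]
    syncdir = Path("/home/ec2-user/syncdir")
    if syncdir.is_dir():
        print("Missing after sync:", sorted(missing_files(str(syncdir), expected)))
```

An empty result means every expected file has arrived in the directory.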

That's it. Now you know how to configure JSCAPE MFT Server to synchronize a remote Linux directory with an Amazon S3 bucket folder.