87

I want to use the AWS S3 cli to copy a full directory structure to an S3 bucket.

So far, everything I've tried copies the files to the bucket, but the directory structure is collapsed (in other words, each file is copied into the root directory of the bucket).

The command I use is:

aws s3 cp --recursive ./logdata/ s3://bucketname/

I've also tried leaving off the trailing slash on my source designation (i.e., the copy-from argument). I've also used a wildcard to designate all files ... each thing I try simply copies the log files into the root directory of the bucket.

agentv
  • 1,060

8 Answers

98

I believe sync is the method you want. Try this instead:

aws s3 sync ./logdata s3://bucketname/
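
For what it's worth, sync preserves the relative paths under the source directory as key prefixes, and it only uploads files that are new or changed. As a rough sketch with placeholder names, a local file such as

./logdata/app1/a.log

should end up at

s3://bucketname/app1/a.log
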
Chad Smith
  • 1,589
40

The following worked for me:

aws s3 cp ~/this_directory s3://bucketname/this_directory --recursive

AWS will then "make" this_directory and copy all of the local contents into it.
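
In other words (placeholder names again), a local file such as

~/this_directory/sub/file.txt

should be uploaded to

s3://bucketname/this_directory/sub/file.txt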

17

I ran into this issue while using either of these commands:

$ aws s3 cp --recursive /local/dir s3://s3bucket/
OR
$ aws s3 sync /local/dir s3://s3bucket/

I even thought of mounting the S3 bucket locally and then running rsync; even that failed (or hung for a few hours), as I have thousands of files.

Finally, s3cmd worked like a charm.

s3cmd sync /local/dir/ --delete-removed s3://s3bucket/ --exclude="some_file" --exclude="*directory*"  --progress --no-preserve

This not only does the job well and shows quite verbose output on the console, but also uploads big files in multiple parts.
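
If I remember correctly, s3cmd also has a --dry-run flag, so you can preview what it would upload before transferring anything:

s3cmd sync --dry-run /local/dir/ s3://s3bucket/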

vikas027
  • 1,249
7

(Improving the solution of Shishir)

  • Save the following script in a file (I named the file s3Copy.sh)
path=$1   # the path of the directory whose files and directories need to be copied
s3Dir=$2  # the S3 bucket path

for entry in "$path"/*; do
    name=$(basename "$entry")   # the name of the file or directory
    if [[ -d $entry ]]; then    # if it is a directory
        aws s3 cp --recursive "$entry" "$s3Dir/$name/"
    else                        # if it is a file
        aws s3 cp "$entry" "$s3Dir/"
    fi
done
  • Run it as follows:
    /PATH/TO/s3Copy.sh /PATH/TO/ROOT/DIR/OF/SOURCE/FILESandDIRS PATH/OF/S3/BUCKET
    For example, if s3Copy.sh is stored in the home directory and I want to copy all the files and directories located in the current directory, then I run this:
    ~/s3Copy.sh . s3://XXX/myBucket

You can easily modify the script to allow for other arguments of s3 cp such as --include, --exclude, ...
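
For example, one minimal way to do that (a sketch, not extensively tested) is to shift off the first two arguments and forward anything that remains straight to aws s3 cp:

path=$1
s3Dir=$2
shift 2   # everything after the first two arguments is passed through to aws s3 cp

for entry in "$path"/*; do
    name=$(basename "$entry")
    if [[ -d $entry ]]; then
        aws s3 cp --recursive "$entry" "$s3Dir/$name/" "$@"
    else
        aws s3 cp "$entry" "$s3Dir/" "$@"
    fi
done

Then something like ~/s3Copy.sh . s3://XXX/myBucket --exclude "*.tmp" would forward the filter to each aws s3 cp call.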

LoMaPh
  • 202
3

Use the following script to copy the folder structure:

s3Folder="s3://xyz.abc.com/asdf"
asset_directory="assets/"   # example value: the local directory to copy (note the trailing slash)

for entry in "$asset_directory"*
do
    echo "Processing - $entry"
    if [[ -d $entry ]]; then
        echo "directory"
        aws s3 cp --recursive "./$entry" "$s3Folder/$entry/"
    else
        echo "file"
        aws s3 cp "./$entry" "$s3Folder/"
    fi
done
3

I couldn't get s3 sync or s3 cp to work on a 55 GB folder with thousands of files and over 2 dozen subdirectories inside. Trying to sync the whole folder would just cause awscli to fail silently without uploading anything to the bucket.

Ended up doing this to first sync all subdirectories and their contents (folder structure is preserved):

nice find . -mindepth 1 -maxdepth 1 -type d | cut -c 3- | while read line; do aws s3 sync "$line" "s3://bucketname/$line"; done

Then I did this to get the 30,000 files in the top level:

nice find . -mindepth 1 -maxdepth 1 -type f | cut -c 3- | while read line; do aws s3 cp "$line" "s3://bucketname/"; done

Make sure to watch the load on the server (pro tip: you can use w to just show the load), and use Ctrl-Z to suspend the command if the load gets too high (fg resumes it).

Putting this here in case it helps anyone in a similar situation.

Notes:

-mindepth 1 excludes .

-maxdepth 1 prevents find from listing contents of sub-directories, since s3 sync handles those successfully.

cut -c 3- removes the "./" from the beginning of each result from find.
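
If any of the names contain spaces, a null-delimited variant of the same idea should be safer (a sketch along the same lines, not tested on the original data set):

nice find . -mindepth 1 -maxdepth 1 -type d -print0 | while IFS= read -r -d '' dir; do aws s3 sync "$dir" "s3://bucketname/${dir#./}"; done
nice find . -mindepth 1 -maxdepth 1 -type f -print0 | while IFS= read -r -d '' file; do aws s3 cp "$file" "s3://bucketname/"; done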

twhitney
  • 133
2

This works for me:

aws s3 sync mydir s3://rahuls-bucket/mydir

brahul
  • 21
1

Alternatively, you could also try the MinIO client, a.k.a. mc:

$ mc cp Desktop/test/test/test.txt s3/miniocloud/Desktop/test/test/
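
To copy a whole directory tree rather than a single file, mc also has a mirror command, which should preserve the directory structure (reusing the placeholder paths above):

$ mc mirror Desktop/test/ s3/miniocloud/Desktop/test/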

Hope it helps.

PS: I am one of the contributors to the project.