Full disclosure: I'm relatively new to the AWS world. As the title states, I'm trying to upload a folder from my local machine to an Amazon S3 bucket, via JupyterLab in SageMaker Studio. I'm able to do this manually by clicking the upload icon in JupyterLab, but I'm hoping to do it with the following code:
```python
import sagemaker
from sagemaker.tuner import (
    IntegerParameter,
    CategoricalParameter,
    ContinuousParameter,
    HyperparameterTuner,
)

sagemaker_session = sagemaker.Session()
region = sagemaker_session.boto_region_name
bucket = sagemaker_session.default_bucket()
prefix = "sagemaker/my-first-proj"
role = sagemaker.get_execution_role()

local_dir = "Users/tomi/DevProjects/WeThePeople/datasets"
inputs = sagemaker_session.upload_data(path=local_dir, bucket=bucket, key_prefix=prefix)
```
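For context, my understanding is that `upload_data` just walks the local directory and uploads each file under `s3://{bucket}/{key_prefix}`, i.e. roughly the equivalent of this plain boto3 sketch (hypothetical; the bucket name is a placeholder for `sagemaker_session.default_bucket()`):

```python
import os
import boto3

s3 = boto3.client("s3")
local_dir = "Users/tomi/DevProjects/WeThePeople/datasets"  # same path as above
bucket = "my-default-bucket"  # placeholder for sagemaker_session.default_bucket()
prefix = "sagemaker/my-first-proj"

# Walk the directory tree and upload each file, preserving its
# relative path under the key prefix.
for root, _, files in os.walk(local_dir):
    for name in files:
        local_path = os.path.join(root, name)
        rel_path = os.path.relpath(local_path, local_dir)
        key = prefix + "/" + rel_path.replace(os.sep, "/")
        s3.upload_file(local_path, bucket, key)
```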
When I run the `upload_data` call, this is the error I get:
```
FileNotFoundError                         Traceback (most recent call last)
Cell In[3], line 2
      1 local_dir = "Users/tomi/DevProjects/WeThePeople/datasets"
----> 2 inputs = sagemaker_session.upload_data(path=local_dir, bucket=bucket, key_prefix=prefix)
      3 print("input spec (in this case, just an S3 path): {}".format(inputs))

File /opt/conda/lib/python3.10/site-packages/sagemaker/session.py:400, in Session.upload_data(self, path, bucket, key_prefix, extra_args)
    397 s3 = self.s3_resource
    399 for local_path, s3_key in files:
--> 400     s3.Object(bucket, s3_key).upload_file(local_path, ExtraArgs=extra_args)
    402 s3_uri = "s3://{}/{}".format(bucket, key_prefix)
    403 # If a specific file was used as input (instead of a directory), we return the full S3 key
    404 # of the uploaded object. This prevents unintentionally using other files under the same
    405 # prefix during training.

File /opt/conda/lib/python3.10/site-packages/boto3/s3/inject.py:318, in object_upload_file(self, Filename, ExtraArgs, Callback, Config)
    287 def object_upload_file(
    288     self, Filename, ExtraArgs=None, Callback=None, Config=None
    289 ):
    290     """Upload a file to an S3 object.
    291
    292     Usage::
   (...)
    316         transfer.
    317     """
--> 318     return self.meta.client.upload_file(
    319         Filename=Filename,
    320         Bucket=self.bucket_name,
    321         Key=self.key,
    322         ExtraArgs=ExtraArgs,
    323         Callback=Callback,
    324         Config=Config,
    325     )

File /opt/conda/lib/python3.10/site-packages/boto3/s3/inject.py:143, in upload_file(self, Filename, Bucket, Key, ExtraArgs, Callback, Config)
    108 """Upload a file to an S3 object.
    109
    110 Usage::
   (...)
    140     transfer.
    141 """
    142 with S3Transfer(self, Config) as transfer:
--> 143     return transfer.upload_file(
    144         filename=Filename,
    145         bucket=Bucket,
    146         key=Key,
    147         extra_args=ExtraArgs,
    148         callback=Callback,
    149     )

File /opt/conda/lib/python3.10/site-packages/boto3/s3/transfer.py:292, in S3Transfer.upload_file(self, filename, bucket, key, callback, extra_args)
    288 future = self._manager.upload(
    289     filename, bucket, key, extra_args, subscribers
    290 )
    291 try:
--> 292     future.result()
    293 # If a client error was raised, add the backwards compatibility layer
    294 # that raises a S3UploadFailedError. These specific errors were only
    295 # ever thrown for upload_parts but now can be thrown for any related
    296 # client error.
    297 except ClientError as e:

File /opt/conda/lib/python3.10/site-packages/s3transfer/futures.py:103, in TransferFuture.result(self)
     98 def result(self):
     99     try:
    100         # Usually the result() method blocks until the transfer is done,
    101         # however if a KeyboardInterrupt is raised we want want to exit
    102         # out of this and propagate the exception.
--> 103         return self._coordinator.result()
    104     except KeyboardInterrupt as e:
    105         self.cancel()

File /opt/conda/lib/python3.10/site-packages/s3transfer/futures.py:266, in TransferCoordinator.result(self)
    263 # Once done waiting, raise an exception if present or return the
    264 # final result.
    265 if self._exception:
--> 266     raise self._exception
    267 return self._result

File /opt/conda/lib/python3.10/site-packages/s3transfer/tasks.py:269, in SubmissionTask._main(self, transfer_future, **kwargs)
    265 self._transfer_coordinator.set_status_to_running()
    267 # Call the submit method to start submitting tasks to execute the
    268 # transfer.
--> 269 self._submit(transfer_future=transfer_future, **kwargs)
    270 except BaseException as e:
    271 # If there was an exception raised during the submission of task
    272 # there is a chance that the final task that signals if a transfer
   (...)
    281
    282 # Set the exception, that caused the process to fail.
    283 self._log_and_set_exception(e)

File /opt/conda/lib/python3.10/site-packages/s3transfer/upload.py:591, in UploadSubmissionTask._submit(self, client, config, osutil, request_executor, transfer_future, bandwidth_limiter)
    589 # Determine the size if it was not provided
    590 if transfer_future.meta.size is None:
--> 591     upload_input_manager.provide_transfer_size(transfer_future)
    593 # Do a multipart upload if needed, otherwise do a regular put object.
    594 if not upload_input_manager.requires_multipart_upload(
    595     transfer_future, config
    596 ):

File /opt/conda/lib/python3.10/site-packages/s3transfer/upload.py:244, in UploadFilenameInputManager.provide_transfer_size(self, transfer_future)
    242 def provide_transfer_size(self, transfer_future):
    243     transfer_future.meta.provide_transfer_size(
--> 244         self._osutil.get_file_size(transfer_future.meta.call_args.fileobj)
    245     )

File /opt/conda/lib/python3.10/site-packages/s3transfer/utils.py:247, in OSUtils.get_file_size(self, filename)
    246 def get_file_size(self, filename):
--> 247     return os.path.getsize(filename)

File /opt/conda/lib/python3.10/genericpath.py:50, in getsize(filename)
     48 def getsize(filename):
     49     """Return the size of a file, reported by os.stat()."""
---> 50     return os.stat(filename).st_size

FileNotFoundError: [Errno 2] No such file or directory: 'Users/tomi/DevProjects/WeThePeople/datasets'
```
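Reading the trace, the failure seems to come from `os.stat`, before any S3 request is made. In case it helps with diagnosis, a check like the following (just the standard library, nothing SageMaker-specific) should show whether the path is visible to the environment the notebook kernel runs in:

```python
import os

local_dir = "Users/tomi/DevProjects/WeThePeople/datasets"

# Where is this notebook actually executing, and can it see the path?
print(os.getcwd())                      # current working directory of the kernel
print(os.path.exists(local_dir))        # the relative path, exactly as written
print(os.path.exists("/" + local_dir))  # the absolute variant with a leading slash
```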
I know for certain, however, that this path exists on my machine. If I go into the terminal, I can clearly access the directory, as seen below:
```
(WeThePeople) tomi@MacBook-Pro-4 datasets % pwd
/Users/tomi/DevProjects/WeThePeople/datasets
```
I thought perhaps this might be an IAM permissions issue on AWS, but the user profile I'm using with SageMaker has both the `AmazonS3FullAccess` and `AmazonSageMakerFullAccess` managed policies attached. Not sure if this is the issue, but I thought I'd mention it.
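In case it's relevant, my understanding is that a permissions problem would surface as an `AccessDenied` `ClientError` from the S3 API rather than a `FileNotFoundError`; a quick boto3 call like this should confirm that (a sketch; `list_buckets` is just an arbitrary cheap request to exercise the credentials):

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
try:
    # A permissions failure would raise a ClientError (e.g. AccessDenied)
    # here, not the FileNotFoundError shown in the traceback above.
    s3.list_buckets()
    print("S3 API reachable with the current credentials")
except ClientError as e:
    print("Permissions problem:", e)
```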
My question is: what could be causing this issue, and how do I resolve it? Could it be some other permissions setting? Is there something else I haven't checked?