Commit 6f03fe7 (parent a1218ff)

Version 1.0.15

- Making RAM share acceptance more explicit
- Fixing producer IAM policy
- Improved documentation for producer rights after creating data product

File tree: 5 files changed (+74, -16 lines)

README.md

Lines changed: 2 additions & 8 deletions

````diff
@@ -204,7 +204,6 @@ You can also use [examples/0_5_setup_account_as.py](examples/0_5_setup_account_as.py)
 Accounts can be both producers and consumers, so you may wish to run this step against the account used above. You may also have Accounts that are Consumer only, and cannot create data shares. This step is only run once per AWS Account and must be run using credentials that have AdministratorAccess as well as being Lake Formation Data Lake Admin:
 
 ```python
-import logging
 from data_mesh_util.lib.constants import *
 from data_mesh_util import DataMeshMacros as data_mesh_macros
 
@@ -244,7 +243,6 @@ The above Steps 1.1 and 1.2 can be run for any number of accounts that you require
 Creating a data product replicates Glue Catalog metadata from the Producer's account into the Data Mesh account, while leaving the source storage at rest within the Producer. The data mesh objects are shared back to the Producer account to enable local control without accessing the data mesh. Data Products can be created from Glue Catalog Databases or one-or-more Tables, but all permissions are managed at Table level. Producers can run this as many times as they require. To create a data product:
 
 ```python
-import logging
 from data_mesh_util import DataMeshProducer as dmp
 
 data_mesh_account = 'insert data mesh account number here'
@@ -285,12 +283,13 @@ data_mesh_producer.create_data_products(
 
 You can also use [examples/1_create_data_product.py](examples/1_create_data_product.py) as an example to build your own application.
 
+Please note that upon creation of a data product, you will see a new Database and Table created in the Data Mesh Account, and this Database and Table have been shared back to the producer AWS Account using Resource Access Manager (RAM). Your producer Account may now be able to query data both from within the data mesh and from their own account, but the security Principal used for Data Mesh Utils may require additional permissions to use Athena or other query services.
+
 ### Step 3: Request access to a Data Product Table
 
 As a consumer, you can view public metadata by assuming the `DataMeshReadOnly` role in the mesh account. You can then create an access request for data products using:
 
 ```python
-import logging
 from data_mesh_util import DataMeshConsumer as dmc
 
 data_mesh_account = 'insert data mesh account number here'
@@ -303,7 +302,6 @@ consumer_credentials = {
 }
 data_mesh_consumer = dmp.DataMeshConsumer(
     data_mesh_account_id=data_mesh_account,
-    log_level=logging.DEBUG,
     region_name=aws_region,
     use_credentials=consumer_credentials
 )
@@ -329,7 +327,6 @@ You can also use [examples/2_consumer_request_access.py](examples/2_consumer_request_access.py)
 In this step, you will grant permissions to the Consumer who has requested access:
 
 ```python
-import logging
 from data_mesh_util import DataMeshProducer as dmp
 
 data_mesh_account = 'insert data mesh account number here'
@@ -342,7 +339,6 @@ producer_credentials = {
 }
 data_mesh_producer = dmp.DataMeshProducer(
     data_mesh_account_id=data_mesh_account,
-    log_level=logging.DEBUG,
     region_name=aws_region,
     use_credentials=producer_credentials
 )
@@ -381,7 +377,6 @@ You can also use [examples/3_grant_data_product_access.py](examples/3_grant_data_product_access.py)
 Permissions have been granted, but the Consumer must allow those grants to be imported into their account:
 
 ```python
-import logging
 from data_mesh_util import DataMeshConsumer as dmc
 
 data_mesh_account = 'insert data mesh account number here'
@@ -394,7 +389,6 @@ consumer_credentials = {
 }
 data_mesh_consumer = dmp.DataMeshConsumer(
     data_mesh_account_id=data_mesh_account,
-    log_level=logging.DEBUG,
     region_name=aws_region,
     use_credentials=consumer_credentials
 )
````

setup.cfg

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,6 +1,6 @@
 [metadata]
 name = aws-data-mesh-utils
-version = 1.0.14
+version = 1.0.15
 author = Ian Meyers
 author_email = [email protected]
 license = Apache 2.0
```

src/data_mesh_util/DataMeshConsumer.py

Lines changed: 20 additions & 3 deletions

```diff
@@ -136,13 +136,30 @@ def finalize_subscription(self, subscription_id: str) -> None:
             source_account=self._data_mesh_account_id
         )
 
-        self._consumer_automator.accept_pending_lf_resource_shares(
-            sender_account=self._data_mesh_account_id
-        )
+        shares = []
+        for k, v in subscription.get(RAM_SHARES).items():
+            shares.append(v.get('arn'))
+
+        # accept the RAM shares attached to the subscription
+        accepted, active, not_found = self._consumer_automator.accept_lf_resource_shares(
+            share_list=shares)
+
+        if len(accepted) > 0:
+            self._logger.info(f"Accepted {len(accepted)} RAM Shares: {str(accepted)}")
+
+        if len(active) > 0:
+            self._logger.info(
+                f"{len(active)} RAM Shares already in Active state")
+
+        if len(not_found) > 0:
+            self._logger.warning(
+                f"Unable to resolve {len(not_found)} RAM Shares: {str(not_found)}")
 
+        # mark the subscription as finalized
         self._subscription_tracker.mark_subscription_as_imported(
             subscription_id=subscription_id
         )
+        self._logger.info(f"Subscription Import Complete")
 
     def get_subscription(self, request_id: str) -> dict:
         return self._subscription_tracker.get_subscription(subscription_id=request_id)
```

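The new `finalize_subscription` body first walks the subscription's RAM share map to collect ARNs before asking the automator to accept them. A minimal sketch of that extraction step, where the `RAM_SHARES` value and the subscription layout are assumptions for illustration (the real values come from the library's constants and its SubscriptionTracker):

```python
# Sketch of the ARN-collection step in finalize_subscription.
# RAM_SHARES and the subscription shape below are hypothetical;
# data_mesh_util defines the real constant and structure.
RAM_SHARES = 'RamShares'  # assumed constant value

subscription = {
    RAM_SHARES: {
        'database-share': {'arn': 'arn:aws:ram:us-east-1:111122223333:resource-share/aaa'},
        'table-share': {'arn': 'arn:aws:ram:us-east-1:111122223333:resource-share/bbb'},
    }
}

# collect the ARN of every RAM share attached to the subscription
shares = [v.get('arn') for v in subscription.get(RAM_SHARES, {}).values()]
print(shares)
```

The list of ARNs is then what gets handed to `accept_lf_resource_shares(share_list=shares)`.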
src/data_mesh_util/lib/ApiAutomator.py

Lines changed: 49 additions & 2 deletions

```diff
@@ -219,6 +219,13 @@ def configure_iam(self, policy_name: str, policy_desc: str, policy_template: str
 
         self._logger.debug("Waiting for User to be ready for inclusion in AssumeRolePolicy")
 
+        # attach the data access policy to the group
+        iam_client.attach_group_policy(
+            GroupName=group_name,
+            PolicyArn=policy_arn
+        )
+        self._logger.info(f"Attached Policy {policy_arn} to Group {group_name}")
+
         role_created = False
         retries = 0
         while role_created is False and retries < 5:
@@ -265,7 +272,7 @@ def configure_iam(self, policy_name: str, policy_desc: str, policy_template: str
                     PolicyArn=policy_arn
                 )
                 policy_attached = True
-                self._logger.info(f"Attached Policy {policy_arn} to {role_name}")
+                self._logger.info(f"Attached Policy {policy_arn} to Role {role_name}")
             except iam_client.exceptions.MalformedPolicyDocumentException as mpde:
                 if "Invalid principal" in str(mpde):
                     # this is raised when something within IAM hasn't yet propagated correctly.
@@ -1147,14 +1154,54 @@ def add_bucket_policy_entry(self, principal_account: str, access_path: str):
         # put the policy back into the bucket store
         s3_client.put_bucket_policy(Bucket=bucket_name, Policy=json.dumps(new_policy))
 
+    def accept_lf_resource_shares(self, share_list: list) -> tuple:
+        '''
+        Causes a list of RAM shares to be accepted by the caller
+        :param share_list:
+        :return:
+        '''
+        ram_client = self._get_client('ram')
+
+        get_response = ram_client.get_resource_share_invitations()
+
+        # only accept pending lakeformation shares from the source account
+        shares_accepted = []
+        shares_active = []
+        shares_not_found = []
+        for r in get_response.get('resourceShareInvitations'):
+            share_arn = r.get('resourceShareArn')
+            if share_arn in share_list:
+                if r.get('status') == 'PENDING':
+                    ram_client.accept_resource_share_invitation(
+                        resourceShareInvitationArn=r.get('resourceShareInvitationArn')
+                    )
+                    shares_accepted.append(share_arn)
+                    self._logger.info(f"Accepted RAM Share {share_arn}")
+                elif r.get('status') == 'ACCEPTED':
+                    shares_active.append(share_arn)
+                else:
+                    shares_not_found.append(share_arn)
+            else:
+                shares_not_found.append(share_arn)
+
+        return shares_accepted, shares_active, shares_not_found
+
     def accept_pending_lf_resource_shares(self, sender_account: str, filter_resource_arn: str = None):
+        '''
+        Accepts all pending resource shares of the specified type from the sending account. Used to automatically accept
+        shares from the data mesh back to the producer
+
+        :param sender_account:
+        :param filter_resource_arn:
+        :return:
+        '''
         ram_client = self._get_client('ram')
 
         get_response = ram_client.get_resource_share_invitations()
 
         accepted_share = False
         for r in get_response.get('resourceShareInvitations'):
-            # only accept peding lakeformation shares from the source account
+            # only accept pending lakeformation shares from the source account
             if r.get('senderAccountId') == sender_account and 'LakeFormation' in r.get('resourceShareName') and r.get(
                     'status') == 'PENDING':
                 if filter_resource_arn is None or r.get('resourceShareArn') == filter_resource_arn:
```

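The triage in the new `accept_lf_resource_shares` method can be illustrated as a pure function, with the actual `accept_resource_share_invitation` call stubbed out since it needs live AWS credentials. The invitation dicts mimic the shape of RAM's `get_resource_share_invitations` response; this is a sketch of the classification logic, not the library's code:

```python
def triage_share_invitations(invitations: list, share_list: list) -> tuple:
    """Partition RAM share invitations the way accept_lf_resource_shares does.

    Sketch only: the real method also calls
    ram_client.accept_resource_share_invitation() for each PENDING match.
    """
    accepted, active, not_found = [], [], []
    for r in invitations:
        share_arn = r.get('resourceShareArn')
        if share_arn in share_list and r.get('status') == 'PENDING':
            accepted.append(share_arn)   # would be accepted via RAM here
        elif share_arn in share_list and r.get('status') == 'ACCEPTED':
            active.append(share_arn)     # already active, nothing to do
        else:
            not_found.append(share_arn)  # not requested, or in an unexpected state
    return accepted, active, not_found

invitations = [
    {'resourceShareArn': 'arn:aws:ram:::share/a', 'status': 'PENDING'},
    {'resourceShareArn': 'arn:aws:ram:::share/b', 'status': 'ACCEPTED'},
    {'resourceShareArn': 'arn:aws:ram:::share/c', 'status': 'PENDING'},
]
result = triage_share_invitations(
    invitations, ['arn:aws:ram:::share/a', 'arn:aws:ram:::share/b'])
print(result)
```

Note that, as in the committed code, an invitation whose ARN was not requested lands in the "not found" bucket, so the warning log can flag unexpected shares as well as unresolved ones.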
src/data_mesh_util/resource/producer_account_policy.pystache

Lines changed: 2 additions & 2 deletions

```diff
@@ -21,10 +21,10 @@
         "glue:GetDatabase",
         "glue:CreateDatabase",
         "glue:CreateTable",
-        "glue:CreateTables",
         "glue:CreateCrawler",
         "glue:Update*",
-        "glue:TagResource"
+        "glue:TagResource",
+        "glue:SearchTables"
     ],
     "Resource": "*"
 },
```

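The policy fix drops `glue:CreateTables` (presumably a typo for `glue:CreateTable`, which the statement already grants) and adds `glue:SearchTables`. A quick sketch of sanity-checking an action list like the corrected one; the JSON fragment below mirrors the template's statement but is reproduced here for illustration:

```python
import json

# Fragment mirroring the corrected statement in producer_account_policy.pystache
statement = json.loads("""
{
    "Action": [
        "glue:GetDatabase",
        "glue:CreateDatabase",
        "glue:CreateTable",
        "glue:CreateCrawler",
        "glue:Update*",
        "glue:TagResource",
        "glue:SearchTables"
    ],
    "Resource": "*"
}
""")

actions = statement["Action"]
# the suspect action is gone and the search permission is present
assert "glue:CreateTables" not in actions
assert "glue:SearchTables" in actions
print(actions)
```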