DMS
EndPoint - Connection error for S3 source endpoint - Test Endpoint failed: Application-Status: 1020912, Application-Message: Failed to connect to database.
Root Cause: DMS Replication Instance and the S3 bucket were not in the same region.
Migration process -
1) create source & target endpoints and test the connection
2) create migration task using the endpoints (map columns as required) and run it.
https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.S3.html
ignoreHeaderRows=1 to ignore header in data
Load task completed - but no data loaded >> check the "Bucket folder" value provided in the endpoint --> combined with the schema/table names it must resolve to the <schema>/<table>/data.csv path.
So, if the sub-directories for schema & table sit directly under the bucket root, leave "Bucket folder" empty.
https://aws.amazon.com/premiumsupport/knowledge-center/dms-task-successful-no-data-s3/
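Concretely, using the bucket name and table path from the endpoint settings further down (file name illustrative), DMS expects a layout like this when "Bucket folder" is left empty:

```
s3://srees-data-bucket-2/          <-- "Bucket folder" left empty
    publicdata/                    <-- schema (TableOwner)
        taxi_trips_202007/         <-- table path
            data.csv
```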
Taxi_trips
https://www.kaggle.com/divineunited/exploring-the-chicago-taxi-trip-dataset
https://data.cityofchicago.org/Transportation/Taxi-Trips/wrvz-psew/data
Data types etc: https://docs.aws.amazon.com/dms/latest/userguide/dms-ug.pdf
CloudWatch logs not written - says Log Group not available or log stream not available - why?
For DMS to write CloudWatch logs, there must be a role named "dms-cloudwatch-logs-role" with the "AmazonDMSCloudWatchLogsRole" policy attached to it. If it's not there, create it manually.
Refer: https://aws.amazon.com/premiumsupport/knowledge-center/dms-cloudwatch-logs-not-appearing/
Important Note:
"ColumnNullable" is true by default even if not mentioned in the Schema JSON. Which means, if any column value is empty/null, job will fail at that point.
Data is loaded to RDS one by one (committed one by one sequnetially) - So even if job fails on data error on a particular row, all rows till then will be loaded, and remaining will be skipped.
So, analyze the data before the load - to mark all nullable columns with schema attribute
"ColumnNullable":"true"
Replication Instance, EndPoints & Replication Tasks
S3 Data Dir Structure
EndPoint settings
Extra connection attributes : bucketName=srees-data-bucket-2;cdcPath=undefined;compressionType=NONE;csvDelimiter=,;csvRowDelimiter=\n;datePartitionEnabled=false;ignoreHeaderRows=1;
Table structure (JSON)
{"TableCount":"1","Tables": [{"TableName":"taxi_trips","TablePath":"publicdata/taxi_trips_202007/","TableOwner":"publicdata","TableColumns": [{"ColumnName":"trip_id","ColumnType":"STRING","ColumnNullable":"false","ColumnIsPk":"true","ColumnLength":"50"},{"ColumnName":"taxi_id","ColumnType":"STRING","ColumnNullable":"false","ColumnIsPk":"false","ColumnLength":"150"
}, ...................................................
{
"ColumnName":"trip_miles",
"ColumnType":"NUMERIC",
"ColumnPrecision":"6", --> TOTAL SIZE
"ColumnScale":"2", --> #DECIMAL PLACES
"ColumnNullable":"true"
},
.....................................................
{ "ColumnName":"dropoff_location",
"ColumnType":"STRING",
"ColumnLength":"30",
"ColumnNullable":"true"
}
],
"TableColumnsTotal":"23"
}
]
}
CloudWatch Logs - sample
AWS API Gateway
Issue faced while invoking it via jQuery:
Access to XMLHttpRequest at 'https://dummyCode.execute-api.us-west-1.amazonaws.com/dev/contactMeToCallSES' from origin 'null' has been blocked by CORS policy: Response to preflight request doesn't pass access control check: No 'Access-Control-Allow-Origin' header is present on the requested resource.
Enable it in API Gateway -
Choose Enable CORS from the Actions drop-down menu.
CORS is required to call your API from a webpage that isn't hosted on the same domain. To enable CORS for a REST API, set the Access-Control-Allow-Origin header in the response object that you return from your function code.
Access to XMLHttpRequest at 'https://2491zdgh7a.execute-api.us-west-1.amazonaws.com/dev/contactMeToCallSES' from origin 'null' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
***** Problem I faced with API Gateway-Lambda integration - requests were failing *****
API tests from the UI (and from Postman) kept failing.
The difference between the working (old account) and the non-working version of my API: in the failing request, the "Endpoint request body after transformations" contained all the headers and other params instead of just the actual JSON request body.
Sun Feb 14 18:31:38 UTC 2021 : Endpoint request body after transformations: {"resource":"/contactMeToCallSES","path":"/contactMeToCallSES","httpMethod":"POST","headers":null,"multiValueHeaders":null,"queryStringParameters":null,"multiValueQueryStringParameters":null,"pathParameters":null,"stageVariables":null,"requestContext":{"resourceId":"t90fph","resourcePath":"/contactMeToCallSES","httpMethod":"POST","extendedRequestId":"av3lqHDCSK4FtNA=","requestTime":"14/Feb/2021:18:31:38 +0000","path":"/contactMeToCallSES","accountId
........... Sun Feb 14 18:31:39 UTC 2021 : Endpoint response body before transformations: null Sun Feb 14 18:31:39 UTC 2021 : Execution failed due to configuration error: Malformed Lambda proxy response Sun Feb 14 18:31:39 UTC 2021 : Method completed with status: 502
In the working version the value of Endpoint request body after transformations was as below --> Integration Request Type is LAMBDA
Sun Feb 14 18:57:09 UTC 2021 : Endpoint request body after transformations: { "firstname": "fname", "lastname": "lasNm", "phone": 1231231231, "email": "email", "desc": "testing from connect4wree"} Sun Feb 14 18:57:09 UTC 2021 : Sending request to https://lambda.us-w
Root Cause & Solution: The Integration Type I selected was Lambda Proxy instead of Lambda
How to enable CORS for the LAMBDA_PROXY integration type?
Set the CORS header in the response returned from the Lambda function itself - API Gateway is just a pass-through in the case of LAMBDA_PROXY.
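A minimal sketch of a LAMBDA_PROXY handler that sets the CORS header itself (the origin value and response body are assumptions; a response missing statusCode/body is what triggers the "Malformed Lambda proxy response" 502 seen above):

```python
import json

def lambda_handler(event, context):
    # With LAMBDA_PROXY the raw request body arrives as a string under
    # event["body"]; the function has to parse it itself.
    payload = json.loads(event.get("body") or "{}")

    # ... call SES / do the real work here ...

    # API Gateway passes this object straight through, so the CORS header
    # must be set here, not in the gateway console.
    return {
        "statusCode": 200,
        "headers": {"Access-Control-Allow-Origin": "*"},
        "body": json.dumps({"received": payload}),
    }
```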
Usage Plans for API Gateway - Throttle the requests
Lambda (Glue b/w services, Serverless-means you don't manage the server)
Pay for processing time.
Chat History Pull
Order history App
Transaction Rate Alarm
Uses
Can trigger based on time/fixed schedule - for periodic batch runs
Lambda Triggers - Many services can generate triggers to Lambda.
Language Support
Produce using the SDK, KPL, or Kinesis Agent
Java SDK - PutRecords (up to 500 records per call), GetRecords
Consume using the SDK, KCL, Firehose, or Lambda
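Since the 500-records-per-call limit applies to each PutRecords request, a producer has to batch; a sketch of the batching logic (the boto3 call is shown commented out since it needs live credentials, and the stream name is an assumption):

```python
def chunked(records, size=500):
    """Yield successive batches no larger than the PutRecords limit."""
    for i in range(0, len(records), size):
        yield records[i:i + size]

# Example usage with boto3 (stream name assumed):
# import boto3
# kinesis = boto3.client("kinesis")
# entries = [{"Data": d.encode(), "PartitionKey": str(i)}
#            for i, d in enumerate(data)]
# for batch in chunked(entries):
#     kinesis.put_records(StreamName="taxi-stream", Records=batch)
```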
Lambdas have a special behaviour when it comes to processing Kinesis event records. When the Lambda throws an error while processing a batch of records, it automatically retries the same batch of records. No further records from the specific shard are processed.
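That retry behaviour means one bad record can stall its whole shard. A common mitigation (a sketch; the record payload format and `process` function are assumptions) is to catch per-record errors and collect them for a dead-letter queue instead of letting the exception escape:

```python
import base64
import json

def handler(event, context):
    failed = []
    for record in event["Records"]:
        try:
            # Kinesis delivers record data base64-encoded
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            process(payload)
        except Exception:
            # Letting this propagate would make Lambda retry the WHOLE
            # batch and block the shard; collect the failure instead.
            failed.append(record["kinesis"]["sequenceNumber"])
    # send `failed` to a DLQ / log for later replay
    return {"failedSequenceNumbers": failed}

def process(payload):
    pass  # placeholder for the real per-record logic
```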
The Lambda inline code editor does not support Java 8 - so write the code locally and upload the JAR https://maven.apache.org/plugins/maven-assembly-plugin/usage.html