Questions tagged with Amazon EMR
I have an EMR cluster and have used the Treasure Data connector to read data from a table into a DataFrame using PySpark. Now these tables that I'm trying to read have approximately 100 million to 500...
1 answer · 0 votes · 100 views · asked 3 days ago
Issue: PySpark works in the first cells (likely SparkSession creation) but throws import errors when using my Python files in later cells.
Environment: AWS EMR ( Amazon EMR...
0 answers · 0 votes · 118 views · asked 9 days ago
Let me know if this is something AWS EMR Studio does:
1. In Databricks Community Edition, and in Google Colab, one can fire up a simple Jupyter notebook with an automatically started cluster (small...
1 answer · 0 votes · 137 views · asked 15 days ago
Hi everyone,
I am using AWS EMR to run ETL operations on very large datasets (millions to billions of records). I am using PySpark and reading the CSV files with *spark.read.csv*. The results...
1 answer · 0 votes · 197 views · asked 17 days ago
While running the serverless job, I am getting the error below:
"Number of cores specified by 'spark.driver.cores '7' is invalid".
2 answers · 0 votes · 201 views · asked 19 days ago
Hi
I have an EMR cluster with HBase in S3 storage mode. I have a read-replica cluster pointing to the same S3 bucket.
Now when I add a record in the primary cluster, flush the table on the primary, and then run refresh_hfiles...
1 answer · 0 votes · 226 views · asked 21 days ago
Hi
I am getting an error while launching EMR with HBase on S3 storage and WAL backup enabled.
Caused by: java.lang.RuntimeException: createWal failed for wal WALMetadata(WALWorkspace=testworkspace2,...
0 answers · 0 votes · 325 views · asked 21 days ago
I have a Python package saved in CodeCommit and need to use it in the notebook attached to my EMR cluster workspace.
The package is already successfully installed via bootstrap.
To do this, in my .sh...
0 answers · 0 votes · 265 views · asked a month ago
I have an EMR Serverless application, and I am submitting a Spark job via a Python script. I have packaged all the dependencies and the script to an S3 bucket. When I execute the job the spark job is...
2 answers · 0 votes · 296 views · asked a month ago
Hello,
I configured an Iceberg-formatted table with transactions in Hive on EMR 6.4.1. When I insert data into the table, the operation gets stuck without any error.
Any insights are highly...
Accepted Answer · Amazon EMR
1 answer · 0 votes · 305 views · asked a month ago
I've started seeing the following error on JupyterHub on EMR
`TypeError: required field "type_ignores" missing from Module`
from the simplest commands
2 answers · 0 votes · 300 views · asked a month ago
Hi Team,
We have an EMR 6.10 cluster where Flink jobs are submitted to an existing application. In my case, the container was running on a task node. Then I resized the task instance group from 1 to 0 in task instance...
Accepted Answer · Amazon EMR
1 answer · 0 votes · 272 views · asked 2 months ago