How to Create an EMR Notebook in Amazon EMR Studio
How Do You Create an EMR Notebook?
Amazon Web Services (AWS) has integrated Amazon EMR Notebooks into Amazon EMR Studio as Workspaces in the new Amazon EMR console. The integration aims to provide a single environment for notebook creation and large-scale data processing. As a result, users of the new console typically create notebooks with the “Create Workspace” button.
To create an EMR notebook the older way, users must open the Amazon EMR console at the supplied web URL and follow the old console's procedure. In that interface, users select “Notebooks” and then “Create notebook”.
When creating a notebook, users enter a name and, optionally, a description. The next critical step is attaching the notebook to an Amazon EMR cluster, which is where the notebook's code runs.
There are two basic ways to associate a cluster:
Select an existing cluster
If a suitable EMR cluster is already running, users can click “Choose,” select it from the list, and click “Choose cluster” to confirm. According to the documentation, clusters used with EMR Notebooks must meet certain requirements. These requirements, the supported EMR release versions, and security considerations are covered in dedicated sections of the documentation.
Create a cluster
Users can also choose “Create a cluster” to have Amazon EMR create a cluster dedicated to the notebook. With this method, users give the cluster a name. The workflow defaults to the latest supported EMR release version and to essential applications such as Hadoop, Spark, and Livy; some settings, such as the release version and the pre-selected applications, may not be modifiable.
Users can customise instance settings by selecting an EC2 instance type and entering the number of instances. One instance is used for the primary node and the rest for core nodes. The instance type determines the maximum number of notebooks that can attach to the cluster at the same time, subject to constraints.
Cluster setup also defines the EMR role and the EC2 instance profile; users can keep the default roles or choose custom ones. Links to more information about these service roles are provided. Users can also choose an EC2 key pair for SSH connections to the cluster instances.
Auto-termination is optional but useful, and is available with Amazon EMR versions 5.30.0 and 6.1.0 and later. Users can select the check box to shut the cluster down automatically after a period of inactivity. Users can also set security groups for the primary instance and the notebook client instance, either keeping the default security groups or choosing custom ones from the cluster's VPC.
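The cluster settings described above can be sketched in code. The example below is a minimal sketch rather than what the console does internally: it builds a plain dictionary shaped like the parameters of the boto3 EMR `run_job_flow` call, and it only adds an idle-timeout policy when the release label meets the 5.30.0 / 6.1.0 rule. The function names, the cluster name, the instance type, and the default role names (`EMR_DefaultRole`, `EMR_EC2_DefaultRole`) are assumptions chosen for illustration.

```python
import re


def supports_auto_termination(release_label: str) -> bool:
    """Return True for EMR 5.30.0 and later 5.x releases, or 6.1.0 and later."""
    match = re.fullmatch(r"emr-(\d+)\.(\d+)\.(\d+)", release_label)
    if match is None:
        raise ValueError(f"unrecognised release label: {release_label!r}")
    version = tuple(int(part) for part in match.groups())
    if version[0] == 5:
        return version >= (5, 30, 0)
    return version >= (6, 1, 0)


def build_cluster_request(name, release_label, instance_type, instance_count,
                          key_name=None, idle_timeout_seconds=None):
    """Build a run_job_flow-style request dict (hypothetical helper)."""
    if instance_count < 1:
        raise ValueError("at least one instance is needed for the primary node")

    # One instance is the primary node; any remaining instances are core nodes.
    groups = [{
        "Name": "Primary",
        "InstanceRole": "MASTER",
        "InstanceType": instance_type,
        "InstanceCount": 1,
    }]
    if instance_count > 1:
        groups.append({
            "Name": "Core",
            "InstanceRole": "CORE",
            "InstanceType": instance_type,
            "InstanceCount": instance_count - 1,
        })

    instances = {"InstanceGroups": groups, "KeepJobFlowAliveWhenNoSteps": True}
    if key_name is not None:
        instances["Ec2KeyName"] = key_name  # optional key pair for SSH access

    request = {
        "Name": name,
        "ReleaseLabel": release_label,
        # Applications the notebook workflow pre-selects, per the text above.
        "Applications": [{"Name": app} for app in ("Hadoop", "Spark", "Livy")],
        "Instances": instances,
        "ServiceRole": "EMR_DefaultRole",      # assumed default EMR role
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumed default instance profile
    }

    if idle_timeout_seconds is not None:
        if not supports_auto_termination(release_label):
            raise ValueError(f"{release_label} does not support auto-termination")
        request["AutoTerminationPolicy"] = {"IdleTimeout": idle_timeout_seconds}
    return request


request = build_cluster_request(
    "notebook-cluster", "emr-6.1.0", "m5.xlarge", 3, idle_timeout_seconds=3600
)
```

With AWS credentials configured, a dictionary like this could be passed to `boto3.client("emr").run_job_flow(**request)` to create the cluster; the notebook itself would still have to be created in the console.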
Besides the cluster settings, notebook creation includes notebook-specific configuration. Users choose a default or custom AWS service role for the notebook client instance. The notebook file is stored in the Amazon S3 notebook location. Amazon EMR can create a default bucket and folder if none exists, or users can choose their own location. Within that location, a folder named after the notebook ID is created, and the notebook file is saved there as NotebookName.ipynb.
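The storage layout can be illustrated with a small helper. This is a sketch under the layout described above (location, then notebook ID folder, then the notebook name with an .ipynb extension); the function name, bucket, and notebook ID below are hypothetical examples.

```python
def notebook_s3_uri(location: str, notebook_id: str, notebook_name: str) -> str:
    """Compose the S3 URI where a notebook file would be stored."""
    if not location.startswith("s3://"):
        raise ValueError("location must be an s3:// URI")
    # Drop any trailing slash so the parts join with exactly one separator.
    base = location.rstrip("/")
    return f"{base}/{notebook_id}/{notebook_name}.ipynb"


uri = notebook_s3_uri("s3://MyBucket/MyNotebooks/", "e-EXAMPLE123", "MyNotebook")
print(uri)
```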
If an encrypted Amazon S3 location is used, the service role for EMR Notebooks (EMR_Notebooks_DefaultRole) must be set up as a key user for the AWS KMS key used for encryption. To learn how to add key users to a key policy, see the AWS KMS documentation and support pages.
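As a rough illustration of what adding a key user means, the sketch below builds a key policy statement that grants the notebook service role the standard key-user actions and appends it to a copy of an existing policy document. The helper names, the Sid, and the account ID are hypothetical, and the role ARN assumes the role has no IAM path; check the actual ARN in IAM before using anything like this.

```python
import copy

# Actions granted to key users in the default AWS KMS key policy.
KEY_USER_ACTIONS = [
    "kms:Encrypt",
    "kms:Decrypt",
    "kms:ReEncrypt*",
    "kms:GenerateDataKey*",
    "kms:DescribeKey",
]


def key_user_statement(account_id: str,
                       role_name: str = "EMR_Notebooks_DefaultRole") -> dict:
    """Build a key policy statement that makes the role a key user."""
    return {
        "Sid": "AllowNotebookRoleToUseKey",  # hypothetical statement ID
        "Effect": "Allow",
        # Assumes the role has no path prefix.
        "Principal": {"AWS": f"arn:aws:iam::{account_id}:role/{role_name}"},
        "Action": list(KEY_USER_ACTIONS),
        "Resource": "*",
    }


def add_key_user(key_policy: dict, statement: dict) -> dict:
    """Return a copy of the key policy with the statement appended."""
    updated = copy.deepcopy(key_policy)
    updated.setdefault("Statement", []).append(statement)
    return updated


existing = {"Version": "2012-10-17", "Statement": []}
updated = add_key_user(existing, key_user_statement("123456789012"))
```

The updated document would then be applied to the key through the AWS KMS console or API; the original policy dictionary is left unchanged so it can be compared or restored.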
Users can optionally link a Git-based repository to the notebook. To do so, select “Git repository”, then “Choose repository”, and pick a repository from the list.
Finally, users can add tags to the notebook as key-value pairs. The documentation includes an important note about a default tag with the key creatorUserID and the value set to the user's IAM user ID. This tag is applied automatically and can be used in IAM policies for access control, so users should not change or delete it. After configuring all options, clicking “Create Notebook” finishes notebook creation.
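To show why the creatorUserID tag matters, here is a minimal sketch of an IAM policy document that only allows notebook actions when the tag value matches the caller's own user ID. The function name and the choice of actions are assumptions for illustration; the condition relies on the `elasticmapreduce:ResourceTag/creatorUserID` key and the `${aws:userId}` policy variable, and should be checked against the current IAM documentation before use.

```python
import json


def creator_only_policy(actions):
    """Allow the given actions only on notebooks the caller created."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": list(actions),
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    # Matches the tag applied automatically at creation time.
                    "elasticmapreduce:ResourceTag/creatorUserID": "${aws:userId}"
                }
            },
        }],
    }


# Example action list (an assumption; adjust to the permissions needed).
policy = creator_only_policy([
    "elasticmapreduce:DescribeEditor",
    "elasticmapreduce:StartEditor",
    "elasticmapreduce:StopEditor",
    "elasticmapreduce:DeleteEditor",
    "elasticmapreduce:OpenEditorInConsole",
])
print(json.dumps(policy, indent=2))
```

If the tag were edited or removed, a policy of this shape would stop matching the notebook, which is why the tag should be left as it is.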
Users should note that these instructions apply to the old console; in the new console, EMR Notebooks are available as EMR Studio Workspaces. To access existing notebooks as Workspaces, or to create new ones with the “Create Workspace” button in the new console, EMR Notebooks users need additional IAM role permissions. Notebooks cannot be created with the Amazon EMR API or CLI.
Some current documentation still gives detailed creation steps that match the old console interface, but this transition signals AWS's intention to centralise notebook creation in EMR Studio.












