Create Glue Crawler

  1. Access the AWS Management Console.

    • Find AWS Glue.
    • Select AWS Glue.

  2. In the AWS Glue interface, select Crawlers.

  3. Choose Create Crawler.

  4. In the Add Crawler interface, enter summitcrawler as the crawler name and select Next.

  5. For Add data source, select S3.

  6. Use Browse to choose the S3 path. Any path that holds your data works. Then select Crawl new sub-folders only and choose Add an S3 data source.

  7. After adding the data source, select Next.

  8. For IAM role, either select Create new IAM role to create a role or choose a role you prepared earlier. Then select Next.

  9. For Target database, choose Add database.

  10. Enter summitdb as the database name and select Create database.

  11. After creating the database, select the database and choose Next.

  12. Review the configuration and select Create crawler.

  13. After the crawler is created successfully, choose Run crawler.

  14. It takes about 1 minute to initialize the crawler.

  15. A confirmation shows that the crawler run was initiated successfully.

  16. After a while, the crawler changes to the Stopping status.

  17. Wait until the crawler status shows Ready.

  18. In the AWS Glue interface, select Tables. You will see 2 data tables.

  19. Select the raw data table.

  20. Explore the details of the data table.
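
The console steps above can also be scripted. The following is a minimal sketch, not part of the original walkthrough, that assumes the boto3 Glue client. The helper names (crawler_config, wait_until_ready, run_walkthrough), the polling interval, and the poll limit are hypothetical choices for illustration. The role ARN and S3 path are supplied by the caller, because the walkthrough leaves both to your preference.

```python
import time


def crawler_config(name, role_arn, database, s3_path):
    """Build keyword arguments for the Glue create_crawler call (steps 4-12)."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        # Counterpart of "Crawl new sub-folders only" in the console.
        "RecrawlPolicy": {"RecrawlBehavior": "CRAWL_NEW_FOLDERS_ONLY"},
        # Assumption: with the recrawl policy above, Glue expects both
        # schema change behaviors to be LOG.
        "SchemaChangePolicy": {"UpdateBehavior": "LOG", "DeleteBehavior": "LOG"},
    }


def wait_until_ready(glue, name, poll_seconds=15, max_polls=80):
    """Poll the crawler state until it returns to READY (steps 14-17)."""
    for _ in range(max_polls):
        state = glue.get_crawler(Name=name)["Crawler"]["State"]
        if state == "READY":
            return state
        time.sleep(poll_seconds)
    raise TimeoutError(f"crawler {name} did not reach READY")


def run_walkthrough(glue, role_arn, s3_path,
                    name="summitcrawler", database="summitdb",
                    poll_seconds=15):
    """Create the database and crawler, run the crawler, list the tables."""
    glue.create_database(DatabaseInput={"Name": database})  # steps 9-10
    glue.create_crawler(**crawler_config(name, role_arn, database, s3_path))
    glue.start_crawler(Name=name)  # step 13
    wait_until_ready(glue, name, poll_seconds)
    # Step 18: only the first page of results is read; pagination is omitted.
    tables = glue.get_tables(DatabaseName=database)["TableList"]
    return [table["Name"] for table in tables]
```

To try this against a real account, pass boto3.client("glue") as the glue argument together with your own role ARN and S3 path. The client is injected rather than created inside the functions so that the flow can be exercised with a stub before it touches AWS.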
