Create Databricks workspaces using Terraform

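This guide shows how to create Databricks workspaces with the Databricks Terraform provider, along with all the required infrastructure on AWS. The instructions apply only to Databricks accounts on the E2 version of the platform. All new Databricks accounts and most existing accounts are now E2. If you are unsure which account type you have, contact your Databricks representative.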

Provider initialization for E2 workspaces

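This guide assumes you have Databricks account owner credentials (databricks_account_username and databricks_account_password). You must also know your account ID (databricks_account_id). To get it, log in to the Databricks account console at https://accounts.cloud.databricks.com. The account ID is shown in the lower-left corner of the console sidebar.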

This guide is provided as-is and is intended as a basis for your own configuration.

variable "databricks_account_username" {}
variable "databricks_account_password" {}
variable "databricks_account_id" {}

variable "tags" {
  default = {}
}

variable "cidr_block" {
  default = "10.4.0.0/16"
}

variable "region" {
  default = "eu-west-1"
}

// Random suffix so that resource names do not collide between deployments
resource "random_string" "naming" {
  special = false
  upper   = false
  length  = 6
}

locals {
  prefix = "demo${random_string.naming.result}"
}
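The empty variable declarations accept any value and print it in plan output. As a minimal sketch, and an assumption rather than part of the original guide, the password variable could instead be declared with an explicit type and marked sensitive (supported in Terraform 0.14 and later) so that its value is redacted from CLI output. This declaration would replace the empty one for the same variable, not sit alongside it.

```hcl
// Assumed alternative to the empty declaration of the same variable.
// "sensitive = true" requires Terraform 0.14 or later.
variable "databricks_account_password" {
  type        = string
  description = "Password of the Databricks account owner"
  sensitive   = true
}
```

The value can then be supplied without writing it to a file, for example through the TF_VAR_databricks_account_password environment variable, which Terraform reads automatically.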

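Before you can manage workspaces, you must create a VPC, a root bucket, a cross-account role, a Databricks E2 workspace, and the host and token outputs. You must also initialize the provider with alias = "mws" and set provider = databricks.mws on every databricks_mws_* resource. All databricks_mws_* resources are expected to be created in their own dedicated Terraform module within your environment. Typically this module also creates the VPC and the IAM roles.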

terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
    }
  }
}

provider "aws" {
  region = var.region
}

// Initialize the provider in "MWS" mode to provision new workspaces
provider "databricks" {
  alias    = "mws"
  host     = "https://accounts.cloud.databricks.com"
  username = var.databricks_account_username
  password = var.databricks_account_password
}

Step 1: Create a VPC

Create an AWS VPC with all the necessary firewall rules. See Customer-managed VPC for complete and up-to-date details on networking. The AWS VPC is registered as the databricks_mws_networks resource.

data "aws_availability_zones" "available" {}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "2.70.0"

  name = local.prefix
  cidr = var.cidr_block
  azs  = data.aws_availability_zones.available.names
  tags = var.tags

  enable_dns_hostnames = true
  enable_nat_gateway   = true
  create_igw           = true

  // One public subnet and two private subnets carved out of the VPC range
  public_subnets = [cidrsubnet(var.cidr_block, 3, 0)]
  private_subnets = [
    cidrsubnet(var.cidr_block, 3, 1),
    cidrsubnet(var.cidr_block, 3, 2)
  ]

  default_security_group_egress = [{
    cidr_blocks = "0.0.0.0/0"
  }]

  default_security_group_ingress = [{
    description = "Allow all internal TCP and UDP"
    self        = true
  }]
}

// Register the VPC with the Databricks account
resource "databricks_mws_networks" "this" {
  provider           = databricks.mws
  account_id         = var.databricks_account_id
  network_name       = "${local.prefix}-network"
  security_group_ids = [module.vpc.default_security_group_id]
  subnet_ids         = module.vpc.private_subnets
  vpc_id             = module.vpc.vpc_id
}
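The cidrsubnet calls in the VPC module add 3 bits to the prefix of the VPC range, so a /16 range is split into /19 subnets, and the last argument selects which subnet to take. The sketch below works through the arithmetic for the default cidr_block of 10.4.0.0/16. The local names are hypothetical and exist only for illustration.

```hcl
// Illustration only: the subnets produced for the default cidr_block.
// Local names are hypothetical and are not used elsewhere in this guide.
locals {
  example_public_subnet    = cidrsubnet("10.4.0.0/16", 3, 0) // "10.4.0.0/19"
  example_private_subnet_a = cidrsubnet("10.4.0.0/16", 3, 1) // "10.4.32.0/19"
  example_private_subnet_b = cidrsubnet("10.4.0.0/16", 3, 2) // "10.4.64.0/19"
}
```

Each /19 subnet holds 8,192 addresses, and the remaining five /19 blocks of the /16 range stay unallocated.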

Step 2: Create a root bucket

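Create an AWS S3 bucket for DBFS workspace storage, commonly referred to as the root bucket. The provider includes the databricks_aws_bucket_policy data source, which supplies the necessary IAM policy template. The AWS S3 bucket must be registered using the databricks_mws_storage_configurations resource.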

resource "aws_s3_bucket" "root_storage_bucket" {
  bucket = "${local.prefix}-rootbucket"
  acl    = "private"
  versioning {
    enabled = false
  }
  force_destroy = true
  tags = merge(var.tags, {
    Name = "${local.prefix}-rootbucket"
  })
}

resource "aws_s3_bucket_public_access_block" "root_storage_bucket" {
  bucket             = aws_s3_bucket.root_storage_bucket.id
  ignore_public_acls = true
  depends_on         = [aws_s3_bucket.root_storage_bucket]
}

// Policy template that grants Databricks access to the root bucket
data "databricks_aws_bucket_policy" "this" {
  bucket = aws_s3_bucket.root_storage_bucket.bucket
}

resource "aws_s3_bucket_policy" "root_bucket_policy" {
  bucket = aws_s3_bucket.root_storage_bucket.id
  policy = data.databricks_aws_bucket_policy.this.json
}

// Register the bucket with the Databricks account
resource "databricks_mws_storage_configurations" "this" {
  provider                   = databricks.mws
  account_id                 = var.databricks_account_id
  bucket_name                = aws_s3_bucket.root_storage_bucket.bucket
  storage_configuration_name = "${local.prefix}-storage"
}

Step 3: Create a cross-account IAM role

The cross-account IAM role is registered with the databricks_mws_credentials resource.

data "databricks_aws_assume_role_policy" "this" {
  external_id = var.databricks_account_id
}

resource "aws_iam_role" "cross_account_role" {
  name               = "${local.prefix}-crossaccount"
  assume_role_policy = data.databricks_aws_assume_role_policy.this.json
  tags               = var.tags
}

data "databricks_aws_crossaccount_policy" "this" {}

resource "aws_iam_role_policy" "this" {
  name   = "${local.prefix}-policy"
  role   = aws_iam_role.cross_account_role.id
  policy = data.databricks_aws_crossaccount_policy.this.json
}

// Register the role with the Databricks account
resource "databricks_mws_credentials" "this" {
  provider         = databricks.mws
  account_id       = var.databricks_account_id
  role_arn         = aws_iam_role.cross_account_role.arn
  credentials_name = "${local.prefix}-creds"
  depends_on       = [aws_iam_role_policy.this]
}

Step 4: Create a Databricks E2 workspace

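Create a Databricks E2 workspace using the databricks_mws_workspaces resource. The code that creates workspaces and the code that manages workspaces must live in separate Terraform modules, to avoid the common confusion between provider = databricks.mws and provider = databricks.created_workspace. This is why you must specify the databricks_host and databricks_token outputs, which are consumed by the modules that follow.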

resource "databricks_mws_workspaces" "this" {
  provider        = databricks.mws
  account_id      = var.databricks_account_id
  aws_region      = var.region
  workspace_name  = local.prefix
  deployment_name = local.prefix

  credentials_id           = databricks_mws_credentials.this.credentials_id
  storage_configuration_id = databricks_mws_storage_configurations.this.storage_configuration_id
  network_id               = databricks_mws_networks.this.network_id
}

// Export the host for use by other modules
output "databricks_host" {
  value = databricks_mws_workspaces.this.workspace_url
}

// Initialize the provider in "normal" mode
provider "databricks" {
  // In normal cases you don't need to give the provider an alias
  alias = "created_workspace"
  host  = databricks_mws_workspaces.this.workspace_url
}

// Create a PAT token to provision entities within the workspace
resource "databricks_token" "pat" {
  provider         = databricks.created_workspace
  comment          = "Terraform Provisioning"
  lifetime_seconds = 86400
}

// Export the token for integration tests to run on
output "databricks_token" {
  value     = databricks_token.pat.token_value
  sensitive = true
}

Provider configuration

To manage the Databricks workspace with Terraform, use the following provider configuration in the module that manages the workspace:

provider "databricks" {
  host  = module.ai.databricks_host
  token = module.ai.databricks_token
}
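The references to module.ai assume that the workspace-creation code from steps 1 to 4 is called as a module named "ai" from the configuration that manages the workspace. A minimal sketch of that call might look like the following. The source path is hypothetical, and the calling configuration is assumed to declare the three account variables itself.

```hcl
// Hypothetical wiring: the module name "ai" comes from the provider block
// above; the source path is an assumption and must match your own layout.
module "ai" {
  source = "./modules/workspace" // hypothetical path to the code from steps 1-4

  databricks_account_username = var.databricks_account_username
  databricks_account_password = var.databricks_account_password
  databricks_account_id       = var.databricks_account_id
}
```

With this wiring, module.ai.databricks_host and module.ai.databricks_token resolve to the two outputs exported in step 4.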