
3. Go Deep into TiDB: Execution Optimization Explained

2022-06-24 03:46:00 luozhiyun

This article is based on an analysis of TiDB release-5.1 and requires Go 1.16 or later. Please credit the source when reprinting. This article was first published on luozhiyun's blog: https://www.luozhiyun.com/archives/592

Overview

The execution of a SQL statement as a whole includes the following steps:

  • First, the server receives the connection from the client;
  • Then the SQL statement is parsed into an AST syntax tree;
  • Once the SQL is parsed, TiDB optimizes it, performing logical plan optimization and then physical plan optimization;
  • The optimized execution plan is then executed;
  • Since TiDB stores all of its data in TiKV, requests must be sent to TiKV to fetch the data;
  • Finally, the data returned by TiKV is assembled and returned to the client that issued the SQL request.

Building the AST syntax tree

In handleQuery, cc.ctx.Parse is called, and this code eventually invokes the Parse method of the Parser:

For example, take the following SQL statement:

select id,name,age from student where age>1 and name='pingcap';

This will be parsed into a syntax tree and stored in the ast.StmtNode data structure:

type SelectStmt struct {
	dmlNode 
	*SelectStmtOpts 
	Distinct bool 
	From *TableRefsClause 
	Where ExprNode 
	Fields *FieldList 
	GroupBy *GroupByClause 
	Having *HavingClause
	...
}

If you are interested in how the SQL is parsed, this article explains it very well: https://pingcap.com/zh/blog/tidb-source-code-reading-5, so I won't go into more depth here.
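
If you just want to see the parser in action, the parser of this era was also published as a standalone Go module. Here is a minimal sketch; the module path github.com/pingcap/parser and the test_driver import are assumptions based on the standalone parser's own quickstart, not code taken from this article:

package main

import (
	"fmt"

	"github.com/pingcap/parser"
	"github.com/pingcap/parser/ast"
	//  the parser needs a driver that implements its value-expression types;
	//  test_driver is the standalone one shipped with the parser module
	_ "github.com/pingcap/parser/test_driver"
)

func main() {
	p := parser.New()
	stmt, err := p.ParseOneStmt(
		"select id,name,age from student where age>1 and name='pingcap';", "", "")
	if err != nil {
		panic(err)
	}
	//  for a SELECT, the returned ast.StmtNode is a *ast.SelectStmt,
	//  with Fields / From / Where filled in as described above
	sel := stmt.(*ast.SelectStmt)
	fmt.Printf("fields=%d, hasWhere=%v\n", len(sel.Fields.Fields), sel.Where != nil)
}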

Optimization

The optimization here consists of three parts: building the execution plan, logical plan optimization, and physical plan optimization.

Continuing down in handleQuery after the AST syntax tree has been built, the Optimize function is called to build and optimize the plan:

Optimize in turn keeps calling down until it reaches the optimize function:

func optimize(ctx context.Context, sctx sessionctx.Context, node ast.Node, is infoschema.InfoSchema) (plannercore.Plan, types.NameSlice, float64, error) {
	...
	// initialize the PlanBuilder
	builder, _ := plannercore.NewPlanBuilder(sctx, is, hintProcessor)
 
	beginRewrite := time.Now()
	//  Build an execution plan 
	p, err := builder.Build(ctx, node)
	if err != nil {
		return nil, nil, 0, err
	}
	... 
	names := p.OutputNames() 
	// if it is not a LogicalPlan, return directly
	logic, isLogicalPlan := p.(plannercore.LogicalPlan)
	if !isLogicalPlan {
		return p, names, 0, nil
	}
	... 
	beginOpt := time.Now()
	// logical plan & physical plan
	finalPlan, cost, err := plannercore.DoOptimize(ctx, sctx, builder.GetOptFlag(), logic)
	sctx.GetSessionVars().DurationOptimization = time.Since(beginOpt)
	return finalPlan, names, cost, err
}

There are two parts here: builder.Build builds the execution plan, and DoOptimize performs the logical and physical optimization.

Building the execution plan

The execution plan is built by calling the builder's Build method. Build switches on the type of the AST node to decide what kind of plan to build:

func (b *PlanBuilder) Build(ctx context.Context, node ast.Node) (Plan, error) {
	b.optFlag |= flagPrunColumns
	switch x := node.(type) { 
	case *ast.DeleteStmt:
		return b.buildDelete(ctx, x) 
	case *ast.InsertStmt:
		return b.buildInsert(ctx, x) 
	case *ast.SelectStmt:
		// handle the select-into syntax
		if x.SelectIntoOpt != nil {
			return b.buildSelectInto(ctx, x)
		}
		return b.buildSelect(ctx, x) 
	case *ast.UpdateStmt:
		return b.buildUpdate(ctx, x) 
	case ast.DDLNode:
		return b.buildDDL(ctx, x)
	...
	}
	return nil, ErrUnsupportedType.GenWithStack("Unsupported type %T", node)
}

There is too much code to analyze it all one by one, so let me use a simple example to show what buildSelect does.

For example, suppose we have a student table and want to query its data:

CREATE TABLE student
(
    id   VARCHAR(31),
    name VARCHAR(50),
    age  int,
    key id_idx (id)
);
INSERT INTO student VALUES ('pingcap001', 'pingcap', 3);

select sum(age) as total_age from student where name='pingcap' group by name;

When TiDB receives a query like this, it first builds an AST syntax tree:

type SelectStmt struct { 
	From *TableRefsClause 
	Where ExprNode 
	Fields *FieldList 
	GroupBy *GroupByClause 
	Having *HavingClause 
	OrderBy *OrderByClause 
	Limit *Limit
	...
}

Then buildSelect constructs the execution plan from the node information of the syntax tree. The execution plan consists of the following operators:

  • DataSource: the data source, i.e. the table, which here is the student table above;
  • LogicalSelection: the filter conditions after WHERE;
  • LogicalAggregation: mainly two pieces of information, the columns after GROUP BY and the aggregate functions among the SELECT fields;
  • Projection: the fields after SELECT;

These operators are nested in a hierarchical relationship:

func (b *PlanBuilder) buildSelect(ctx context.Context, sel *ast.SelectStmt) (p LogicalPlan, err error) {
	... 
	// build the DataSource operator
	p, err = b.buildTableRefs(ctx, sel.From)
	if err != nil {
		return nil, err
	}
 

	if sel.GroupBy != nil {
		// resolve the GROUP BY columns
		p, gbyCols, err = b.resolveGbyExprs(ctx, p, sel.GroupBy, sel.Fields.Fields)
		if err != nil {
			return nil, err
		}
	}
 
	// build LogicalSelection from the WHERE conditions
	if sel.Where != nil {
		p, err = b.buildSelection(ctx, p, sel.Where, nil)
		if err != nil {
			return nil, err
		}
	} 
	// check whether the SQL contains aggregate functions
	hasAgg := b.detectSelectAgg(sel)
	needBuildAgg := hasAgg
	if hasAgg {
		if b.buildingRecursivePartForCTE {
			return nil, ErrCTERecursiveForbidsAggregation.GenWithStackByArgs(b.genCTETableNameForError())
		}
		// len(aggFuncs) == 0 and sel.GroupBy == nil means that all aggregate functions in the SELECT fields are correlated aggregates from the outer query, which have already been built there
		aggFuncs, totalMap = b.extractAggFuncsInSelectFields(sel.Fields.Fields) 
		if len(aggFuncs) == 0 && sel.GroupBy == nil {
			needBuildAgg = false
		}
	}
    // build the LogicalAggregation operator from the aggregate functions in the SQL
	if needBuildAgg {
		var aggIndexMap map[int]int
		p, aggIndexMap, err = b.buildAggregation(ctx, p, aggFuncs, gbyCols, correlatedAggMap)
		if err != nil {
			return nil, err
		}
		for agg, idx := range totalMap {
			totalMap[agg] = aggIndexMap[idx]
		}
	}

	var oldLen int 
	// build the Projection operator for the projected fields
	p, projExprs, oldLen, err = b.buildProjection(ctx, p, sel.Fields.Fields, totalMap, nil, false, sel.OrderBy != nil)
	if err != nil {
		return nil, err
	}
	// build the HAVING conditions
	if sel.Having != nil {
		b.curClause = havingClause
		p, err = b.buildSelection(ctx, p, sel.Having.Expr, havingMap)
		if err != nil {
			return nil, err
		}
	} 
	// build the LogicalSort operator
	if sel.OrderBy != nil {
		if b.ctx.GetSessionVars().SQLMode.HasOnlyFullGroupBy() {
			p, err = b.buildSortWithCheck(ctx, p, sel.OrderBy.Items, orderMap, windowMapper, projExprs, oldLen, sel.Distinct)
		} else {
			p, err = b.buildSort(ctx, p, sel.OrderBy.Items, orderMap, windowMapper)
		}
		if err != nil {
			return nil, err
		}
	}
	// build the LogicalLimit operator
	if sel.Limit != nil {
		p, err = b.buildLimit(p, sel.Limit)
		if err != nil {
			return nil, err
		}
	} 
	... 
	return p, nil
}

This process is very complicated, and I have omitted many details. For our example query the resulting operator tree is, from top to bottom, roughly Projection -> LogicalAggregation -> LogicalSelection -> DataSource. Now let's look at how buildTableRefs constructs the DataSource operator.

buildTableRefs first determines the type of the incoming sel.From node (via buildResultSetNode):

func (b *PlanBuilder) buildResultSetNode(ctx context.Context, node ast.ResultSetNode) (p LogicalPlan, err error) {
	// Type check the incoming node 
	switch x := node.(type) {
	// join type 
	case *ast.Join:
		return b.buildJoin(ctx, x)
	// TableSource type
	case *ast.TableSource:
		var isTableName bool
		switch v := x.Source.(type) {
		case *ast.SelectStmt:
			ci := b.prepareCTECheckForSubQuery()
			defer resetCTECheckForSubQuery(ci)
			p, err = b.buildSelect(ctx, v)
		case *ast.SetOprStmt:
			ci := b.prepareCTECheckForSubQuery()
			defer resetCTECheckForSubQuery(ci)
			p, err = b.buildSetOpr(ctx, v)
		case *ast.TableName:
			p, err = b.buildDataSource(ctx, v, &x.AsName)
			isTableName = true
		default:
			err = ErrUnsupportedType.GenWithStackByArgs(v)
		} 
		// check for duplicate column names
		dupNames := make(map[string]struct{}, len(p.Schema().Columns))
		for _, name := range p.OutputNames() {
			colName := name.ColName.O
			if _, ok := dupNames[colName]; ok {
				return nil, ErrDupFieldName.GenWithStackByArgs(colName)
			}
			dupNames[colName] = struct{}{}
		}
		return p, nil
	...
	}
}

Because the FROM clause of a SQL statement can be followed by a table name, a join, a subquery, and so on, different cases have to be distinguished here. In our example above the SQL is simple and FROM is followed only by a table name, so control first enters buildJoin, which checks whether an actual join is being performed; since there is none, it moves on to the ast.TableSource branch and then calls the buildDataSource method.

func (b *PlanBuilder) buildDataSource(ctx context.Context, tn *ast.TableName, asName *model.CIStr) (LogicalPlan, error) {
	dbName := tn.Schema
	sessionVars := b.ctx.GetSessionVars()

	//  Get the metadata of the table from the cache according to the table name 
	tbl, err := b.is.TableByName(dbName, tn.Name)
	if err != nil {
		return nil, err
	}
	tableInfo := tbl.Meta() 
	...
	// check whether it is a virtual table
	if tbl.Type().IsVirtualTable() {
		if tn.TableSample != nil {
			return nil, expression.ErrInvalidTableSample.GenWithStackByArgs("Unsupported TABLESAMPLE in virtual tables")
		}
		return b.buildMemTable(ctx, dbName, tableInfo)
	}
	//  Verify whether it is a view 
	if tableInfo.IsView() {
		if tn.TableSample != nil {
			return nil, expression.ErrInvalidTableSample.GenWithStackByArgs("Unsupported TABLESAMPLE in views")
		}
		return b.BuildDataSourceFromView(ctx, dbName, tableInfo)
	}
	//  Verify whether the table is a sequence object 
	if tableInfo.IsSequence() {
		if tn.TableSample != nil {
			return nil, expression.ErrInvalidTableSample.GenWithStackByArgs("Unsupported TABLESAMPLE in sequences")
		}
 		return b.buildTableDual(), nil
	}
	//  Verify whether there are partitions 
	if tableInfo.GetPartitionInfo() != nil {
		...
	} else if len(tn.PartitionNames) != 0 {
		return nil, ErrPartitionClauseOnNonpartitioned
	}

	tblName := *asName
	if tblName.L == "" {
		tblName = tn.Name
	}
	// get the access paths (indexes) that might be used
	possiblePaths, err := getPossibleAccessPaths(b.ctx, b.TableHints(), tn.IndexHints, tbl, dbName, tblName, b.isForUpdateRead, b.is.SchemaMetaVersion())
	if err != nil {
		return nil, err
	}  
	//  Get the fields of the table 
	var columns []*table.Column
	if b.inUpdateStmt { 
		columns = tbl.WritableCols()
	} else if b.inDeleteStmt {
 		columns = tbl.FullHiddenColsAndVisibleCols()
	} else {
		columns = tbl.Cols()
	}
	var statisticTable *statistics.Table
	if _, ok := tbl.(table.PartitionedTable); !ok || b.ctx.GetSessionVars().UseDynamicPartitionPrune() {
		statisticTable = getStatsTable(b.ctx, tbl.Meta(), tbl.Meta().ID)
	} 
	// build the DataSource struct
	ds := DataSource{
		DBName:              dbName,
		TableAsName:         asName,
		table:               tbl,
		tableInfo:           tableInfo,
		statisticTable:      statisticTable,
		astIndexHints:       tn.IndexHints,
		IndexHints:          b.TableHints().indexHintList,
		indexMergeHints:     indexMergeHints,
		possibleAccessPaths: possiblePaths,
		Columns:             make([]*model.ColumnInfo, 0, len(columns)),
		partitionNames:      tn.PartitionNames,
		TblCols:             make([]*expression.Column, 0, len(columns)),
		preferPartitions:    make(map[int][]model.CIStr),
		is:                  b.is,
		isForUpdateRead:     b.isForUpdateRead,
	}.Init(b.ctx, b.getSelectOffset())
	var handleCols HandleCols
	schema := expression.NewSchema(make([]*expression.Column, 0, len(columns))...)
	names := make([]*types.FieldName, 0, len(columns))
	//  Add fields 
	for i, col := range columns {
		ds.Columns = append(ds.Columns, col.ToInfo())
		names = append(names, &types.FieldName{
			DBName:            dbName,
			TblName:           tableInfo.Name,
			ColName:           col.Name,
			OrigTblName:       tableInfo.Name,
			OrigColName:       col.Name,
			Hidden:            col.Hidden,
			NotExplicitUsable: col.State != model.StatePublic,
		})
		newCol := &expression.Column{
			UniqueID: sessionVars.AllocPlanColumnID(),
			ID:       col.ID,
			RetType:  col.FieldType.Clone(),
			OrigName: names[i].String(),
			IsHidden: col.Hidden,
		}
		// check whether this is an int-type primary key handle column
		if col.IsPKHandleColumn(tableInfo) {
			handleCols = &IntHandleCols{col: newCol}
		}
		schema.Append(newCol)
		ds.TblCols = append(ds.TblCols, newCol)
	} 
	// if there is no int-type primary key, create an extra handle column by default
	if handleCols == nil {
		if tableInfo.IsCommonHandle {
			primaryIdx := tables.FindPrimaryIndex(tableInfo)
			handleCols = NewCommonHandleCols(b.ctx.GetSessionVars().StmtCtx, tableInfo, primaryIdx, ds.TblCols)
		} else {
			extraCol := ds.newExtraHandleSchemaCol()
			handleCols = &IntHandleCols{col: extraCol}
			ds.Columns = append(ds.Columns, model.NewExtraHandleColInfo())
			schema.Append(extraCol)
			names = append(names, &types.FieldName{
				DBName:      dbName,
				TblName:     tableInfo.Name,
				ColName:     model.ExtraHandleName,
				OrigColName: model.ExtraHandleName,
			})
			ds.TblCols = append(ds.TblCols, extraCol)
		}
	}
	ds.handleCols = handleCols
	handleMap := make(map[int64][]HandleCols)
	handleMap[tableInfo.ID] = []HandleCols{handleCols}
	b.handleHelper.pushMap(handleMap)
	ds.SetSchema(schema) 
	ds.names = names
	ds.setPreferredStoreType(b.TableHints())
	ds.SampleInfo = NewTableSampleInfo(tn.TableSample, schema.Clone(), b.partitionedTable)
	b.isSampling = ds.SampleInfo != nil 
	... 
	return result, nil
}

The buildDataSource method mainly builds the DataSource struct from the table's original metadata. DataSource records the table's various basic pieces of information.

For now I will only cover the process of building the DataSource; feel free to browse the rest of the code if you are interested.

DoOptimize

Let's go back to the optimize function. After builder.Build constructs the execution plan, plannercore.DoOptimize is called. It first performs logical optimization, which is mainly rule-based optimization (RBO, rule-based optimization). It then turns the logical plan into a physical plan based on cost, i.e. the cost-based optimization (CBO) process.

func DoOptimize(ctx context.Context, sctx sessionctx.Context, flag uint64, logic LogicalPlan) (PhysicalPlan, float64, error) { 
	if flag&flagPrunColumns > 0 && flag-flagPrunColumns > flagPrunColumns {
		flag |= flagPrunColumnsAgain
	}
	// first run the corresponding logical optimizations on the built plan
	logic, err := logicalOptimize(ctx, flag, logic)
	if err != nil {
		return nil, 0, err
	}
	if !AllowCartesianProduct.Load() && existsCartesianProduct(logic) {
		return nil, 0, errors.Trace(ErrCartesianProductUnsupported)
	}
	planCounter := PlanCounterTp(sctx.GetSessionVars().StmtCtx.StmtHints.ForceNthPlan)
	if planCounter == 0 {
		planCounter = -1
	}
	//  Then physical optimization 
	physical, cost, err := physicalOptimize(logic, &planCounter)
	if err != nil {
		return nil, 0, err
	}
	// finally, post-optimization
	finalPlan := postOptimize(sctx, physical)
	return finalPlan, cost, nil
}

Let's look at logical optimization first.

Logical optimization

func logicalOptimize(ctx context.Context, flag uint64, logic LogicalPlan) (LogicalPlan, error) {
	var err error
    //  Traverse optimization rules 
	for i, rule := range optRuleList { 
		if flag&(1<<uint(i)) == 0 || isLogicalRuleDisabled(rule) {
			continue
		} 
        //  Perform optimization 
		logic, err = rule.optimize(ctx, logic)
		if err != nil {
			return nil, err
		}
	}
	return logic, err
}

Logical optimization iterates over all the optimization rules in optRuleList and applies each enabled one in turn. TiDB currently has the following optimization rules:

var optRuleList = []logicalOptRule{
	&gcSubstituter{},
	&columnPruner{},
	&buildKeySolver{},
	&decorrelateSolver{},
	&aggregationEliminator{},
	&projectionEliminator{},
	&maxMinEliminator{},
	&ppdSolver{},
	&outerJoinEliminator{},
	&partitionProcessor{},
	&aggregationPushDownSolver{},
	&pushDownTopNOptimizer{},
	&joinReOrderSolver{},
	&columnPruner{}, // column pruning again at last, note it will mess up the results of buildKeySolver
}

Each line here is one optimizer. For example, gcSubstituter replaces expressions with matching virtual generated columns so that indexes can be used; columnPruner prunes columns, removing unused ones so they never have to be read, reducing the amount of data read; aggregationEliminator eliminates unnecessary aggregate computation when grouping by a unique key, reducing the amount of computation.
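
Each rule in the list implements a small internal interface. Reconstructed from the way logicalOptimize invokes the rules above (so treat the exact shape as an approximation of release-5.1 rather than a verbatim quote):

//  reconstructed sketch: each rule rewrites the logical plan and returns it;
//  the real interface in planner/core also carries a name() used for tracing
type logicalOptRule interface {
	optimize(ctx context.Context, p LogicalPlan) (LogicalPlan, error)
	name() string
}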

Here are a few examples :

columnPruner: column pruning

Column pruning removes the columns that operators do not use, reducing the total amount of data read. Take the query on our example table above:

select sum(age) as total_age from student where name='pingcap' group by name;

This SQL only uses the age and name columns, and the DataSource, LogicalSelection, LogicalAggregation, and Projection operators built for the LogicalPlan all implement the PruneColumns method.

When PruneColumns is executed, it recurses through the nested child operators, collects all the columns that are actually used, and removes the unused ones.
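
The real method on LogicalPlan takes the set of columns the parent needs, roughly PruneColumns(parentUsedCols) in this release. The following is a toy sketch, not TiDB code, of the recursion it performs: each node keeps only the columns its parent needs plus the ones it uses itself, and asks its children to produce exactly that set:

//  toy sketch of recursive column pruning (not TiDB code)
type prunableNode struct {
	usedBySelf []string //  columns this operator itself references (filters, group keys, ...)
	output     []string //  columns this operator currently produces
	children   []*prunableNode
}

func prune(n *prunableNode, parentUsed []string) {
	need := make(map[string]bool)
	for _, c := range parentUsed {
		need[c] = true
	}
	for _, c := range n.usedBySelf {
		need[c] = true
	}
	//  drop output columns that nobody above needs
	kept := n.output[:0]
	for _, c := range n.output {
		if need[c] {
			kept = append(kept, c)
		}
	}
	n.output = kept
	//  children only have to produce the columns still needed here
	childNeed := make([]string, 0, len(need))
	for c := range need {
		childNeed = append(childNeed, c)
	}
	for _, child := range n.children {
		prune(child, childNeed)
	}
}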

Predicate Push Down (PPD)

The basic idea of predicate pushdown is to move filter expressions as close to the data source as possible, so that irrelevant data can be skipped entirely during actual execution.

For example, consider the following simple SQL:

select * from student where name='pingcap';

In this query, pushing the predicate name='pingcap' down to TiKV and filtering the data there reduces the overhead of network transmission.

However, predicate pushdown has many limitations. For example, predicates cannot be pushed below a Limit node: filtering first and then limiting is not the same thing as limiting first and then filtering. Likewise, predicates on the inner table of an outer join cannot be pushed down, because an outer join fills the inner table with NULLs for rows that do not satisfy the ON condition, so those rows cannot be filtered directly.
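
The rule that performs this in the list above is ppdSolver. The sketch below, again a toy rather than TiDB code, shows the contract it relies on: each operator receives candidate predicates, absorbs whatever it can legally evaluate below itself, and returns the predicates that must stay above it (the real method is roughly PredicatePushDown(predicates) on LogicalPlan, returning the remaining predicates and the rewritten plan):

//  toy sketch of predicate pushdown (not TiDB code)
type predNode interface {
	//  returns the predicates that could NOT be pushed below this node
	pushDown(preds []string) []string
}

type limitNode struct{ child predNode }

func (l *limitNode) pushDown(preds []string) []string {
	//  filtering below a LIMIT would change which rows get limited,
	//  so nothing passes through: push an empty set and keep everything here
	l.child.pushDown(nil)
	return preds
}

type scanNode struct{ pushed []string }

func (s *scanNode) pushDown(preds []string) []string {
	//  a scan can evaluate predicates right at the data source (e.g. inside TiKV)
	s.pushed = append(s.pushed, preds...)
	return nil
}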

Physical optimization

In this phase, the optimizer selects a concrete physical implementation for each operator in the logical execution plan, converting the logical plan produced by the logical optimization stage into a physical execution plan.

Different physical implementations of a logical operator differ in time complexity, resource consumption, and physical properties. During this process, the optimizer determines the cost of the different physical implementations based on data statistics and selects the physical execution plan with the lowest overall cost.

Take our SQL again:

select sum(age) as total_age from student where name='pingcap' group by name;

After logical optimization, a logical plan like this is generated:

The logical operators in this statement are DataSource, Aggregation, and Projection. For the DataSource, for example, the physical optimization phase must choose between an IndexReader that reads data through an index, a TableReader that reads data by row ID, and an IndexLookUpReader that reads the index first and then fetches the rows from the table.
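
The kind of comparison involved can be pictured with the toy example below; the cost factors and formulas are invented for illustration and are not TiDB's actual cost model, which derives row counts from statistics and uses tuned cost factors:

//  toy cost comparison between reader choices (illustrative numbers only)
func chooseReader(tableRows, matchedRows float64) string {
	const (
		scanFactor   = 1.0  //  assumed cost per row for a sequential scan
		lookupFactor = 10.0 //  assumed cost per row for a random back-table read
	)
	tableScanCost := tableRows * scanFactor
	//  index lookup: scan the matching index entries, then fetch each row by row ID
	indexLookUpCost := matchedRows*scanFactor + matchedRows*lookupFactor
	if indexLookUpCost < tableScanCost {
		return "IndexLookUpReader"
	}
	return "TableReader"
}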

physicalOptimize

func physicalOptimize(logic LogicalPlan, planCounter *PlanCounterTp) (PhysicalPlan, float64, error) { 
	// records the physical properties each operator requires of the data returned by its children
	prop := &property.PhysicalProperty{
		TaskTp:      property.RootTaskType,
		ExpectedCnt: math.MaxFloat64,
	}

	logic.SCtx().GetSessionVars().StmtCtx.TaskMapBakTS = 0
	//  Converting a logical plan into a physical plan  
	t, _, err := logic.findBestTask(prop, planCounter)
	if err != nil {
		return nil, 0, err
	}
	...
	return t.plan(), t.cost(), err
}

findBestTask recursively calls findBestTask on the child nodes, generating physical operators and estimating their costs, and then chooses the cheapest option. Note that results are cached per required property via getTask/storeTask, so each combination of plan node and property is only computed once.

type LogicalPlan interface {
	// findBestTask converts the logical plan to the physical plan. It's a new interface.
	// It is called recursively from the parent to the children to create the result physical plan.
	// Some logical plans will convert the children to the physical plans in different ways, and return the one
	// With the lowest cost and how many plans are found in this function.
	// planCounter is a counter for planner to force a plan.
	// If planCounter > 0, the clock_th plan generated in this function will be returned.
	// If planCounter = 0, the plan generated in this function will not be considered.
	// If planCounter = -1, then we will not force plan.
	findBestTask(prop *property.PhysicalProperty, planCounter *PlanCounterTp) (task, int64, error)
  //..
}

findBestTask is a method on the LogicalPlan interface; its main purpose is to turn a logical plan into a physical plan.

func (p *baseLogicalPlan) findBestTask(prop *property.PhysicalProperty, planCounter *PlanCounterTp) (bestTask task, cntPlan int64, err error) {
	
	bestTask = p.getTask(prop)
	if bestTask != nil {
		planCounter.Dec(1)
		return bestTask, 1, nil
	}
 

	bestTask = invalidTask
	cntPlan = 0
	newProp := prop.CloneEssentialFields()
	var plansFitsProp, plansNeedEnforce []PhysicalPlan
	var hintWorksWithProp bool 
	//  Return all physical plans under the logical operator 
	plansFitsProp, hintWorksWithProp, err = p.self.exhaustPhysicalPlans(newProp)
	if err != nil {
		return nil, 0, err
	} 
	...
	var cnt int64
	var curTask task
	//  Find the best task
	if bestTask, cnt, err = p.enumeratePhysicalPlans4Task(plansFitsProp, newProp, false, planCounter); err != nil {
		return nil, 0, err
	} 
	...
END:
	p.storeTask(prop, bestTask)
	return bestTask, cntPlan, nil
}

func (p *baseLogicalPlan) enumeratePhysicalPlans4Task(physicalPlans []PhysicalPlan, prop *property.PhysicalProperty, addEnforcer bool, planCounter *PlanCounterTp) (task, int64, error) {
	var bestTask task = invalidTask
	var curCntPlan, cntPlan int64
	childTasks := make([]task, 0, len(p.children))
	childCnts := make([]int64, len(p.children))
	cntPlan = 0
	for _, pp := range physicalPlans {
		 
		childTasks = childTasks[:0] 
		curCntPlan = 1
		TimeStampNow := p.GetlogicalTS4TaskMap()
		savedPlanID := p.ctx.GetSessionVars().PlanID
		// recursively find the best task for each child
		for j, child := range p.children {
			childTask, cnt, err := child.findBestTask(pp.GetChildReqProps(j), &PlanCounterDisabled)
			childCnts[j] = cnt
			if err != nil {
				return nil, 0, err
			}
			curCntPlan = curCntPlan * cnt
			if childTask != nil && childTask.invalid() {
				break
			}
			childTasks = append(childTasks, childTask)
		}

		if len(childTasks) != len(p.children) {
			continue
		}
		// attach the child tasks to the parent task
		curTask := pp.attach2Task(childTasks...) 
		...
		// compare the cost and keep the cheaper task
		if curTask.cost() < bestTask.cost() || (bestTask.invalid() && !curTask.invalid()) {
			bestTask = curTask
		}
	}
	return bestTask, cntPlan, nil
}

Every physical plan produced here is wrapped in a task structure:

// task is a new version of `PhysicalPlanInfo`. It stores cost information for a task.
type task interface {
	count() float64
	addCost(cost float64)
	cost() float64
	copy() task
	plan() PhysicalPlan
	invalid() bool
}

There are two kinds of task: a rootTask runs on the TiDB side, while a copTask runs in the distributed storage layer. As the source comments put it:

  • rootTask is the final sink node of a plan graph. It should be a single goroutine on tidb.
  • copTask is a task that runs in a distributed kv store.
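
Putting the pieces together, the physical search is a memoized recursion over the logical tree: enumerate the physical candidates of each node, recursively cost the children, and keep the cheapest combination. The sketch below is a toy, not TiDB code; the real implementation additionally keys the memo by the required physical property, as the getTask/storeTask calls above suggest:

import "math"

//  toy sketch of the memoized best-plan search (not TiDB code)
type logicalNode struct {
	candidates []float64 //  self-cost of each physical candidate for this node
	children   []*logicalNode
	memo       *float64 //  cached best cost, playing the role of getTask/storeTask
}

func findBestCost(n *logicalNode) float64 {
	if n.memo != nil {
		return *n.memo
	}
	//  every candidate pays its own cost plus the best cost of each child
	childSum := 0.0
	for _, child := range n.children {
		childSum += findBestCost(child)
	}
	best := math.Inf(1)
	for _, selfCost := range n.candidates {
		if cost := selfCost + childSum; cost < best {
			best = cost
		}
	}
	n.memo = &best
	return best
}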

Summary

The optimization process is long and very complicated, and this article can only scratch the surface. Many of the details are hard to understand when following the source code, and I will analyze them further in later articles.

References

https://pingcap.com/blog-cn/tidb-source-code-reading-3

https://tech.meituan.com/2018/05/20/sql-parser-used-in-mtdp.html

https://pingcap.com/zh/blog/tidb-source-code-reading-5

https://pingcap.com/zh/blog/tidb-source-code-reading-6

https://pingcap.com/zh/blog/tidb-source-code-reading-7

https://pingcap.com/zh/blog/tidb-source-code-reading-8

https://zhuanlan.zhihu.com/p/373889188

https://lenshood.github.io/2020/10/03/tidb-lesson-6/

https://github.com/xieyu/blog/blob/master/src/tidb/datasource-physical-optimize.md
