当前位置：网站首页>Design and implementation of high performance go log library zap

Design and implementation of high performance go log library zap

2022-06-24 12:35:00 【luozhiyun】

Please state the source of reprint ~, This article was published at luozhiyun The blog of ：https://www.luozhiyun.com/archives/542

Recently, I'm also learning how to make code run more efficiently in development , And then when I browse all kinds of excellent log designs, I see uber One called zap I've got my attention in my journal , Its main feature is the extreme optimization of performance and memory allocation .

For me , Originally used in the project logrus As log output , But see zap Of benchmark, I feel that in terms of performance logrus I don't know how many streets , So that's what drives me to see how it optimizes .

Apart from that, of course , I want to be more familiar with it , In order to make more efficient use of this library , And avoid being unable to locate when problems occur .

Now let's put some more bluffing benchmark , To provide you with the motivation to look down ：

Package	Time	Time % to zap	Objects Allocated
zap	862 ns/op	+0%	5 allocs/op
zap (sugared)	1250 ns/op	+45%	11 allocs/op
zerolog	4021 ns/op	+366%	76 allocs/op
go-kit	4542 ns/op	+427%	105 allocs/op
apex/log	26785 ns/op	+3007%	115 allocs/op
logrus	29501 ns/op	+3322%	125 allocs/op
log15	29906 ns/op	+3369%	122 allocs/op

zap Design

log Instantiation

At the beginning of use , We can learn about it through official examples zap Internal components ：

log := zap.NewExample()

NewExample The function shows how to pass NewCore To create a Core Structure , According to the name, we should also be able to guess that this structure is zap At the heart of .

For a journal , The most important is nothing more than these three categories ：

How to serialize the input data ;
Where to store the input data after serialization , Console or file , Or somewhere else ;
And then there's the log level , yes Debug、Info Or is it Error;

Empathy zap In the same way , In the use of NewCore establish Core The three parameters that need to be passed in when constructing a structure correspond to ： The encoder of the input data Encoder、 The destination of the log data WriteSyncer, And log level LevelEnabler.

except NewExample Besides this construction method ,zap It also provides NewProduction、NewDevelopment To construct log instances ：

log, _  = zap.NewProduction()
log, _  = zap.NewDevelopment()

These two functions will build a Config The structure is then called Build Method to create NewCore Required parameters , Then instantiate the log instance .

zap

Output of log data

Initializing log After instance , It can be used Info、Debug、Error And so on ：

	log  = zap.NewExample()
	url := "http://example.org/api"
	log.Info("failed to fetch URL",
		zap.String("url", url),
		zap.Int("attempt", 3),
		zap.Duration("backoff", time.Hour),
	)

Let's take another look zap The implementation steps of printing a structured log ：

First, we will check the level of log configuration , for example Error Log configuration level cannot be output Debug Log out ;
And then encapsulate the log data into a Entry example ;
Because in zap You can pass in multiCore, So it's going to take more than one Core Add to CheckedEntry In the example ;
Traverse CheckedEntry In the example Cores,
1. according to Core Medium Encoder To serialize log data to Buffer in ;
2. Again by WriteSyncer take Buffer Output the log data of ;

Interface and framework design

zap2

In the design of code structure , Through simple interface encapsulation , Realize the configuration combination of various styles , To meet all kinds of needs . In the top layer of the design to achieve the three log For different functions ：

Logger： It's cumbersome to use , Only structured output can be used , But the performance is better ;

SugaredLogger： have access to Printf To output logs , Performance comparison Logger The comparison is poor 40% about ;

zapgrpc： Used as a grpc Log output of ;

In design Logger It can be easily transformed into SugaredLogger and zapgrpc. These are a few Logger Need to pass in a Core Interface implementation class can be created .

Core Interface ：zap It also provides a variety of implementation options ：NewNopCore 、ioCore、multiCore 、hook.

The most common is ioCore、multiCore , It can be seen from the name multiCore It can contain more than one ioCore A configuration of , For example, you can make Error Log output a log format and set a log output destination , Give Way Info Logs are exported to other places in another log format .

I also said , about Core Implementation class of ioCore For example, it needs to pass in three objects ： The encoder of the input data Encoder、 The destination of the log data WriteSyncer, And log level LevelEnabler.

Encoder Interface ：zap Provides consoleEncoder、jsonEncoder The implementation of the , Respectively provided console Format and JSON Format log output , these Encoder Each has its own serialization implementation , It's faster to format the code ;

EncoderConfig： Above mentioned Encoder Can also be based on EncoderConfig The configuration allows users to flexibly configure the output format of the log , From the key name of the log message 、 Log level name , To the definition of time format output , The definition of method name can be flexibly configured through it .

WriteSyncer Interface ：zap Provides writerWrapper Single log output implementation of , And can output logs to multiple places multiWriteSyncer Realization ;

Entry ： That's it , To the encapsulation of log data . First, the log data is encapsulated into a Entry, Contains the log name 、 Log time 、 The log level , And log data , No, Field Information , And then experience Core Of Check Method will generate a CheckedEntry example .

CheckedEntry Contains all the information of the log data , Including the above Entry、 Call stack information, etc .

performance

Use object pool

zap adopt sync.Pool Provided object pool , Reuse a large number of reusable objects , If the sync.Pool Students who don't know very well , You can read this article ：《 Many pictures explain Go Of sync.Pool Source code https://www.luozhiyun.com/archives/416 》.

zap In instantiation CheckedEntry 、Buffer、Encoder Waiting for the object , Directly from the object pool , Instead of instantiating a new , In this way, reusing objects can reduce GC The pressure of the , Reduce memory allocation .

Avoid reflection

If we use the official log library , Output logs like this ：

log.Printf("%s login, age:%d", "luoluo", 19)

log Called Printf The function actually calls fmt.Sprintf Function to format log data , And then output ：

func Printf(format string, v ...interface{}) {
	std.Output(2, fmt.Sprintf(format, v...))
}

however fmt.Sprintf Efficiency is actually very low , By looking at fmt.Sprintf Source code , We can see that there are two reasons for low efficiency ：

fmt.Sprintf The type of acceptance is interface{}, Reflection is used internally ;
fmt.Sprintf The purpose of is to format strings , You need to parse the format string , such as %s、 %d And so on. , It increases the parsing time .

But in zap in , It's built-in Encoder, It's going through the internal Buffer With byte To splice log data in the form of , Reduce the performance loss caused by reflection ; as well as zap It's the structured log used , So there was no %s、 %d Such identifiers need to be resolved , It's also a performance improvement point .

More efficient and flexible serializers

stay zap I have realized consoleEncoder、jsonEncoder Two serializers , Both of these serializers can be based on the incoming EncoderConfig To achieve flexible configuration of log format , This flexible configuration is not just for log output key The name of , But through EncoderConfig To call the user-defined Encoder Realization .

And like logrus The serialization JSON We use the serialization tool of the standard library , It's also less efficient .

zap The code analysis

Because I feel zap My code is elegant , So here's a little bit of source code .

initialization

initialization Core

We also see in the picture above ,Core Yes, there is 4 Implementation classes , We use the most commonly used ioCore As an example to explain .

type ioCore struct {
	LevelEnabler
	enc Encoder
	out WriteSyncer
}

ioCore It's very simple inside , A total of three fields are required , Namely ： The encoder of the input data Encoder、 The destination of the log data WriteSyncer, And log level LevelEnabler.

func NewCore(enc Encoder, ws WriteSyncer, enab LevelEnabler) Core {
	return &ioCore{
		LevelEnabler: enab,
		enc:          enc,
		out:          ws,
	}
}

In the use of NewCore Function creation ioCore It also returns an object pointer .

initialization Logger

zap Will pass New Function to instantiate a Logger：

func New(core zapcore.Core, options ...Option) *Logger {
	if core == nil {
		return NewNop()
	}
	log := &Logger{
		core:        core,
		errorOutput: zapcore.Lock(os.Stderr),
		addStack:    zapcore.FatalLevel + 1,
		clock:       _systemClock,
	}
	return log.WithOptions(options...)
}

New Function will set the corresponding default fields , Include core example 、 Error log output 、 The output level of the stack log 、 Log time, etc , Then instantiate a Logger Object returns a pointer .

Logger The information about the structure is as follows ：

type Logger struct {
	core zapcore.Core
	//  Whether it's a development model 
	development bool
	//  Whether to print the line number 
	addCaller   bool
	onFatal     zapcore.CheckWriteAction // default is WriteThenFatal

	name        string
	//  Error log output 
	errorOutput zapcore.WriteSyncer
	//  Output call stack 
	addStack zapcore.LevelEnabler
	//  Print the caller's line number 
	callerSkip int

	clock Clock
}

Logger The structure contains a lot of configuration information , We can develop through WithOptions To add the corresponding parameters . Such as adding log line number ：

log := zap.New(core).WithOptions(zap.AddCaller())

AddCaller Function creates a callback hook for WithOptions perform , That's what functional programming is all about ：

func (log *Logger) WithOptions(opts ...Option) *Logger {
	c := log.clone()
	for _, opt := range opts {
    //  call  Option  Interface method  
		opt.apply(c)
	}
	return c
}

WithOptions You can pass in Option Array , Then traverse the array and call apply Method ,Option It's an interface , Provided only apply Method ：

type optionFunc func(*Logger)

func (f optionFunc) apply(log *Logger) {
	f(log)
}
//  Definition  Option  Interface 
type Option interface {
	apply(*Logger)
}

func AddCaller() Option {
  //  return  Option
	return WithCaller(true)
}

func WithCaller(enabled bool) Option {
  //  take  func  Strong conversion  optionFunc  type 
	return optionFunc(func(log *Logger) {
		log.addCaller = enabled
	})
}

The code here is very interesting , stay go A function in is also a type , and struct There's a way to do it .

ad locum optionFunc As a function type , It has achieved apply Method , So it's equivalent to inheriting Option This interface . And then in WithCaller Use in optionFunc Wrap a function in a layer , It's kind of amazing , But actually with int64(123) There is no essential difference .

And then in WithOptions Function will get WithCaller The returned one turns into optionFunc Function of type , And pass in log perform , That's a change log Of addCaller attribute .

Do not understand the compiler can try on their own .

Print log

The whole process of printing log is as follows ：

zap3

The first is acquisition CheckedEntry example , Encapsulate the corresponding log data ;
And then according to core Enclosed inside encoder Encoding , Put the encoded content into buffer in ;
take buffer The content in is output to core Enclosed in WriteSyncer in .

After we finish the initialization Logger after , You can use it to call the corresponding Info、Warn、Error And so on . Because all log level output methods are the same , So this is through Info Methods to analyze .

func (log *Logger) Info(msg string, fields ...Field) {
    //  Check whether the log should be printed 
	if ce := log.check(InfoLevel, msg); ce != nil {
        //  Print log 
		ce.Write(fields...)
	}
}

This method first calls check Method for verification , It is mainly to check whether the current log data should be printed under the configured log level .

about Info At the log level, it passes in InfoLevel,Error At the log level, it passes in ErrorLevel, stay zap The log level is defined by these constants ：

type Level int8

const ( 
	DebugLevel Level = iota - 1 
	InfoLevel 
	WarnLevel 
	ErrorLevel 
	DPanicLevel 
	PanicLevel 
	FatalLevel

	_minLevel = DebugLevel
	_maxLevel = FatalLevel
)

The smallest DebugLevel from -1 Start .

check Check

func (log *Logger) check(lvl zapcore.Level, msg string) *zapcore.CheckedEntry { 
	const callerSkipOffset = 2
	//  Determine whether the incoming log level should be printed 
	if lvl < zapcore.DPanicLevel && !log.core.Enabled(lvl) {
		return nil
	}
 
	//  Encapsulate the log data into a  Entry
	ent := zapcore.Entry{
		LoggerName: log.name,
		Time:       log.clock.Now(),
		Level:      lvl,
		Message:    msg,
	}
	// If the log can be written, a  CheckedEntry  Instance pointer 
	ce := log.core.Check(ent, nil)
	willWrite := ce != nil 
	... 
	if !willWrite {
		return ce
	}
 
	ce.ErrorOutput = log.errorOutput
	//  Determine whether to print the calling line number 
	if log.addCaller {
		//  Get the stack frame of the caller 
		frame, defined := getCallerFrame(log.callerSkip + callerSkipOffset)
		if !defined {
			fmt.Fprintf(log.errorOutput, "%v Logger.check error: failed to get caller\n", ent.Time.UTC())
			log.errorOutput.Sync()
		}
		//  Set value callers  entry
		ce.Entry.Caller = zapcore.EntryCaller{
			Defined:  defined,
			PC:       frame.PC,
			File:     frame.File,
			Line:     frame.Line,
			Function: frame.Function,
		}
	}
	if log.addStack.Enabled(ce.Entry.Level) {
		//  Encapsulating call stack information 
		ce.Entry.Stack = StackSkip("", log.callerSkip+callerSkipOffset).String
	} 
	return ce
}

First of all, we call core Of Enabled Method to determine whether the log should be printed . This judgment is because the log level is actually a int8 Type of , So it can be judged directly according to the size .

func (l Level) Enabled(lvl Level) bool {
	return lvl >= l
}

After judging that there is no problem, it will call Check Method to get CheckedEntry Instance pointer . Get it CheckedEntry After the instance pointer, the value will be set according to the configuration information , Then return .

Let's see how to get CheckedEntry Instance pointer .

func (c *ioCore) Check(ent Entry, ce *CheckedEntry) *CheckedEntry {
	//  Check that  level  Whether the log should be printed 
	if c.Enabled(ent.Level) {
		//  obtain  CheckedEntry
		return ce.AddCore(ent, c)
	}
	return ce
}

It will pass here CheckedEntry Of AddCore Method to get , The main thing we need is the incoming ce It's a nil The pointer , But it's not a plan Go Call its AddCore Method （ If you put it in the java It's time to report a mistake ）.

var (
	_cePool = sync.Pool{New: func() interface{} {
		// Pre-allocate some space for cores.
		return &CheckedEntry{
			cores: make([]Core, 4),
		}
	}}
)

func (ce *CheckedEntry) AddCore(ent Entry, core Core) *CheckedEntry {
	if ce == nil {
		//  from  _cePool  Get in there  CheckedEntry  example 
		ce = getCheckedEntry()
		ce.Entry = ent
	}
    //  Because it could be  multi core  So you need  append  once 
	ce.cores = append(ce.cores, core)
	return ce
}

func getCheckedEntry() *CheckedEntry {
	//  from  pool  Get object 
	ce := _cePool.Get().(*CheckedEntry)
	//  Reset the properties of the object 
	ce.reset()
	return ce
}

AddCore The method is also very simple , You should see it at a glance , Not much to say .

Write Log printing

func (ce *CheckedEntry) Write(fields ...Field) {
	if ce == nil {
		return
	}
	... 
	var err error
	//  Traverse all of  core  Write log data 
	for i := range ce.cores {
		err = multierr.Append(err, ce.cores[i].Write(ce.Entry, fields))
	}
	...
	//  take  CheckedEntry  Put it back in the cache pool 
	putCheckedEntry(ce)
	...
}

This is the call core Of Write Method to write log data , Keep going .

func (c *ioCore) Write(ent Entry, fields []Field) error {
	//  call  Encoder  Of  EncodeEntry  Method to encode the log data 
	buf, err := c.enc.EncodeEntry(ent, fields)
	if err != nil {
		return err
	}
	//  Log data through  WriteSyncer  write in 
	_, err = c.out.Write(buf.Bytes())
	//  take buffer Put it back in the cache pool 
	buf.Free()
	if err != nil {
		return err
	}
	if ent.Level > ErrorLevel {
		c.Sync()
	}
	return nil
}

Write The method depends on the encoder , Then the corresponding encoding is called EncodeEntry Method . Whether it's jsonEncoder still consoleEncoder Will be in EncodeEntry Method from bufferpool Get one Buffer example , Then encapsulate the data in a certain format Buffer In the example .

After getting the data , Would call WriteSyncer Of Write Method to write the log data to .

The final will be Buffer Instance is released back to bufferpool in .

summary

This article mainly explains zap The design principle and code implementation of . We can see that through the design of coding structure, it can realize rich functions through simple configuration . In terms of performance , Mainly by using object pools , Reduce memory allocation overhead ; Built in high performance serializer , Reduce the overhead on serialization ; And through structured log formats , Reduce fmt.Sprintf The overhead of formatting log data .