Building a Golang Protoc Plugin to SQL Scan+Value Enums

2025-06-30

At Sentiance (where I work as a Software Engineer), we use gRPC and protobuf to communicate between services in a typesafe manner.

Here’s a recent problem I faced. I had several enums in protobuf files that I wanted to save as database columns. But instead of their integer representation, I wanted to store their string representations.

Let’s say this is the enum I wanted to store.

package userama;

enum UserType {
	UNSPECIFIED = 0;
	SUPER = 1;
	COMMON = 2;
}

The generated Golang code for our enum looks like this.

type UserType int32  
  
const (  
    UserType_UNSPECIFIED UserType = 0  
    UserType_SUPER       UserType = 1  
    UserType_COMMON      UserType = 2  
)  
  
// Enum value maps for UserType.  
var (  
    UserType_name = map[int32]string{  
       0: "UNSPECIFIED",  
       1: "SUPER",  
       2: "COMMON",  
    }  
    UserType_value = map[string]int32{  
       "UNSPECIFIED": 0,  
       "SUPER":       1,  
       "COMMON":      2,  
    }  
)

...more stuff

Note the two exported maps, <EnumName>_name and <EnumName>_value. These allow us to convert between the string and int32 representations of our enum values.
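To see the round trip in action, here's a self-contained sketch that redefines the two maps locally (so it runs without the generated userama package; in real code you'd use the generated `UserType_name`/`UserType_value` directly):

```go
package main

import "fmt"

// Local stand-ins for the generated maps, so this snippet is self-contained.
var (
	UserType_name = map[int32]string{
		0: "UNSPECIFIED",
		1: "SUPER",
		2: "COMMON",
	}
	UserType_value = map[string]int32{
		"UNSPECIFIED": 0,
		"SUPER":       1,
		"COMMON":      2,
	}
)

func main() {
	fmt.Println(UserType_name[1])         // int32 -> string: SUPER
	fmt.Println(UserType_value["COMMON"]) // string -> int32: 2
}
```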

Let’s say this is the struct representing the database row.

type User struct {
	ID    string           `pg:"id"`
	Name  string           `pg:"name"`
	Email string           `pg:"email"`
	Type  userama.UserType `pg:"type"`
}

Since the underlying type of the generated enum in Golang is int32, storing this struct naively in the database (using a library that maps the fields to column names via the corresponding struct tags) would result in the type column storing an integer value.

For a value:

user := User{
	ID:    "dwedwedwdwd",
	Name:  "Shaunion",
	Email: "[email protected]",
	Type:  userama.UserType_SUPER,
}

We get the table:

id           name      email                type
dwedwedwdwd  Shaunion  [email protected]  1

But an integer isn’t very human-readable if, for some reason, we need to inspect our database columns directly. We want to store the string representation of the enum in the database and cast it back to the typed int32 when reading it out.

How do we do this?

SQL Scanner + Valuer Interfaces

We can implement the Scanner and Valuer interfaces, using a wrapper that takes advantage of the exported maps.

type UserTypeWrapper struct {
	userama.UserType
}

// Value implements the SQL driver.Valuer interface (from
// database/sql/driver), allowing us to map a Golang value
// into a postgres column.
func (u UserTypeWrapper) Value() (driver.Value, error) {

	// Since the underlying type is just an int32, you could technically
	// assign any value, whether predefined or not, to the field. It's
	// good to have a check.
	if stringVal, present := userama.UserType_name[int32(u.UserType)]; present {
		return stringVal, nil
	}

	return nil, fmt.Errorf("%d is not a valid value", u.UserType)
}

// Scan implements the SQL Scanner interface, allowing us to
// map a value from a postgres column to a variable.
func (u *UserTypeWrapper) Scan(src any) error {

	asString, cast := src.(string)
	if !cast {
		return fmt.Errorf("%v is not a valid string", src)
	}

	int32Val, present := userama.UserType_value[asString]
	if !present {
		return fmt.Errorf("%s is not a valid value", asString)
	}

	u.UserType = userama.UserType(int32Val)
	return nil
}

Note: We could have also used u.UserType.String() to get the string version of the enum, the only problem with that being the lack of any validation. If we were to assign the value 56 to u.UserType, String() wouldn’t complain. It would fall back to printing 56, which is a very confusing error to try to debug.
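That fallback behaviour is easy to demonstrate. Here's a rough, self-contained simulation of it (the real generated String() delegates to the protobuf runtime; enumString below just mimics the observable behaviour using a local copy of the name map):

```go
package main

import (
	"fmt"
	"strconv"
)

// Local stand-in for the generated UserType_name map.
var UserType_name = map[int32]string{0: "UNSPECIFIED", 1: "SUPER", 2: "COMMON"}

// enumString roughly mimics the generated String() method: known values
// print their name, unknown values silently fall back to the raw number.
func enumString(v int32) string {
	if s, ok := UserType_name[v]; ok {
		return s
	}
	return strconv.Itoa(int(v))
}

func main() {
	fmt.Println(enumString(1))  // SUPER
	fmt.Println(enumString(56)) // 56 -- no error, just a confusing string
}
```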

The only problem with this approach is that we have to write this wrapper for every single enum type. It only needs to be done once per enum, but introducing a new enum type now comes with extra manual work.

We can do better by creating a protobuf plugin to handle this for us.

Parts of a Protobuf Plugin

When generating go code using the protoc tool like so.

protoc --go_out=. --go_opt=paths=source_relative user.proto

We’re making use of at least two separate binaries (that I am aware of): protoc and protoc-gen-go.

Protoc

protoc is the protobuf compiler. It is installed on my machine via homebrew (there are also other ways to install it).

➜  ~ which protoc
/opt/homebrew/bin/protoc

It is the entry point for protobuf compilation (transpilation?). It parses and validates incoming protobuf files, resolving any dependencies along the way, ensuring type safety and any other goodness for which we prefer strict data schemas.

Its output is not Golang code, but a CodeGeneratorRequest (itself encoded in the protobuf wire format). This is then piped to the stdin of whatever code generator plugin we want to use to produce our final output.

That’s what the --go_out and --go_opt flags do. They indicate to the protoc binary that we want to generate Golang code and store it at a certain path (and with some options).

As you might have guessed, protoc looks for a binary in our PATH named protoc-gen-go. More generally, for a given flag --<name>_out it looks for a binary named protoc-gen-<name>.

Protoc Go Plugin

protoc-gen-go is the official plugin for generating Golang code from a protobuf file. It takes a CodeGeneratorRequest as input via stdin and writes the corresponding Golang code to its stdout. This is then read back by the protoc compiler and stored at the correct paths.

If we want to pass options to the plugin we can do so via the --go_opt flag.

Building our own Plugin

Let’s build a plugin that augments the Golang code generated by protoc-gen-go and adds the methods that would make it satisfy the Scanner and Valuer interfaces.

We start off by defining the basic structure.

package main

import (
	"context"
	"fmt"
	"io"
	"os"

	"google.golang.org/protobuf/compiler/protogen"
)

// version identifies the plugin build; set it however you prefer
// (hardcoded, via -ldflags, etc.).
var version = "dev"

func main() {
	if err := run(context.Background(), os.Args[1:], os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1) // Only exit point in the program.
	}
}

func run(ctx context.Context, args []string, stdout io.Writer) error {
	// Handle version command.
	if len(args) > 0 && args[0] == "version" {
		fmt.Fprintln(stdout, version)
		return nil
	}

	// Main plugin logic.
	protogen.Options{}.Run(func(gen *protogen.Plugin) error {
		// Process proto files here.
		return nil
	})

	return nil
}

Handling the version command isn’t strictly necessary but is a nice quality of life addition to our plugin.

After some basic boilerplate, our plugin code starts executing by invoking protogen.Options{}.Run. It takes a callback which provides us with the parsed (and validated) protobuf files. gen.Files is where each of the files that were fed to the compiler are available.

Since we are targeting enums, we need to loop over the input files and gather up the enums.

	...
	protogen.Options{}.Run(func(gen *protogen.Plugin) error { 
	
		for _, f := range gen.Files { 
			if !f.Generate {
				// Skip files not marked for generation. This includes Google's
				// well-known types, as well as any imports or dependencies the
				// compiler pulled in for type-checking but which we aren't
				// interested in.
				continue
			}
			
			// TODO: process the file.
		}

		return nil 
	}) 
	...

Protobuf enums can appear in two places: top-level enums, and nested enums.

package userama;

enum UserType {
	UNSPECIFIED = 0;
	SUPER = 1;
	COMMON = 2;
}

message User {
	enum Type {
		UNSPECIFIED = 0;
		SUPER = 1;
		COMMON = 2;
	}
}

Both are valid ways of defining enums, and which one you pick depends on your use case. If an enum is hyper-specific to a message type, you might define it as part of that message type itself. In this case you would refer to it as User.Type instead of UserType. For enums that might be shared by many messages, it makes more sense to create a package-level (or top-level) enum.

We need to account for both cases in our code.

// TemplateData holds the data structure to be used by Golang's text/template package.
type TemplateData struct {
    PackageName string  
    Enums       []EnumData  
    Version     string  
}  
  
type EnumData struct {  
    Name        string
}

Next we collect our enums.

...

for _, f := range gen.Files {

	if !f.Generate {
		continue
	}

	data := TemplateData{
		PackageName: string(f.GoPackageName),
		Enums:       make([]EnumData, 0),
		Version:     version, // It's a good idea to include the version of the tool that generated this code.
	}

	// Top-level enums.
	for _, enum := range f.Enums {
		data.Enums = append(data.Enums, EnumData{
			Name: enum.GoIdent.GoName, // Go type name.
		})
	}

	// Nested enums from messages.
	for _, message := range f.Messages {
		data.Enums = append(data.Enums, collectNestedEnums(message)...)
	}
}
...

Since messages can contain other messages, we need to recursively check them and collect enums.

func collectNestedEnums(message *protogen.Message) []EnumData { 
	var enums []EnumData // Direct enums in this message.
	
	for _, enum := range message.Enums { 
		enums = append(enums, EnumData{Name: enum.GoIdent.GoName}) 
	} 
	
	// Recurse into nested messages.
	for _, nestedMessage := range message.Messages { 
		enums = append(enums, collectNestedEnums(nestedMessage)...) 
	} 
	
	return enums
}
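The recursion is easy to exercise without running protoc at all. Here's a sketch using simplified stand-ins for protogen.Message and protogen.Enum (the real types carry much more, but the Enums/Messages shape is the same):

```go
package main

import "fmt"

// Simplified stand-ins for protogen.Enum and protogen.Message,
// just enough to exercise the recursion.
type Enum struct{ GoName string }

type Message struct {
	Enums    []*Enum
	Messages []*Message
}

// collectNestedEnums walks a message tree and gathers every enum name,
// mirroring the structure of the real plugin code.
func collectNestedEnums(m *Message) []string {
	var names []string

	for _, e := range m.Enums {
		names = append(names, e.GoName)
	}

	// Recurse into nested messages.
	for _, nested := range m.Messages {
		names = append(names, collectNestedEnums(nested)...)
	}

	return names
}

func main() {
	root := &Message{
		Enums: []*Enum{{GoName: "User_Type"}},
		Messages: []*Message{
			{Enums: []*Enum{{GoName: "User_Settings_Theme"}}},
		},
	}
	fmt.Println(collectNestedEnums(root)) // [User_Type User_Settings_Theme]
}
```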

Once collected, we will use text/template to generate our enum code. This is the template we will use.

var enumScannerTemplate = template.Must(template.New("enum_scanner").Parse(`
// Code generated by protoc-gen-enum-wrappers {{.Version}}. DO NOT EDIT.
package {{.PackageName}}

import (
	"database/sql/driver"
	"fmt"
)

{{range .Enums}}
// Value implements driver.Valuer for database storage.
func (e {{.Name}}) Value() (driver.Value, error) {
	if stringVal, present := {{.Name}}_name[int32(e)]; present {
		return stringVal, nil
	}

	return nil, fmt.Errorf("%d is not a valid value", e)
}

// Scan implements sql.Scanner for database retrieval.
func (e *{{.Name}}) Scan(src any) error {

	// No point in trying to scan a nil value.
	if src == nil {
		return nil
	}

	asString, cast := src.(string)
	if !cast {
		return fmt.Errorf("%v is not a valid string", src)
	}

	int32Val, present := {{.Name}}_value[asString]
	if !present {
		return fmt.Errorf("%s is not a valid value", asString)
	}

	*e = {{.Name}}(int32Val)
	return nil
}
{{end}}
`))

{{range .Enums}} helps us loop over our collected enums, and then it’s a matter of inserting the enum names in the right places.

The final part consists of executing our template and generating a file to store the results.

filename := f.GeneratedFilenamePrefix + "_enum_scanners.pb.go"
g := gen.NewGeneratedFile(filename, f.GoImportPath)

The GeneratedFilenamePrefix comes from the original .proto filename. For user.proto, this creates user_enum_scanners.pb.go. The .pb.go suffix follows protobuf conventions.

var buf bytes.Buffer 
if err := enumScannerTemplate.Execute(&buf, data); err != nil { 
	...
} 

g.P(buf.String())

g.P() writes lines to the generated file.

And with that, we’ve written a plugin that implements Valuer/Scanner on our enums. The google.golang.org/protobuf/compiler/protogen package takes care of the rest, converting our output into a CodeGeneratorResponse that is then read back by protoc.
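With the plugin compiled and on our PATH, invoking it alongside protoc-gen-go looks something like the following (assuming we've named the binary protoc-gen-enum-wrappers, matching the header comment in our template):

```
protoc --go_out=. --go_opt=paths=source_relative \
       --enum-wrappers_out=. --enum-wrappers_opt=paths=source_relative \
       user.proto
```

protoc maps the --enum-wrappers_out flag to the protoc-gen-enum-wrappers binary, exactly as it does for protoc-gen-go.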

Going back to our example:

id           name      email                type
dwedwedwdwd  Shaunion  [email protected]  SUPER
Much more readable!
