Building a flow compiler for sending campaigns and auto reply
At work, I’m working on the flow compiler, which compiles flow documents into ETA flow descriptors. The use case is to give our users the ability to automatically respond to their customers’ messages: users can define their own flows and customize the auto reply behaviour. This article gives a look into my work. Also see the related article.
Package flow implements the flow document compiler and runtime controller. The compiler parses a flow document and compiles it into a format specific to that flow document type, which can then be executed by a runtime engine. Currently, there are two compiler engines: auto reply and campaign. The compilation process is divided into five phases: scanner, parser, analyzer, IR generation, and output generation.
Package Structure
List of compiler engines:
- AutoReply: Compiles flow documents with type AUTO_REPLY to *etapb.FlowDescriptor. Use case: to reply automatically to customer messages.
- Campaign: Compiles flow documents with type SENDOUT to *etapb.FlowDescriptor and a campaign configuration. Use case: to send out messages to customers and reply to their responses.
The general structure:
- compiler: Contains the generic compiler infrastructure implementation, which can be shared by different compiler engines.
- compilers: Contains the compiler engine implementations, each responsible for compiling one type of flow document into its specific format.
- registry: Contains the registry implementation, which registers different types of objects for the scanner, parser, analyzer, and the different compiler engines.
- objects: Provides the objects used by the compiler engines, together with their customized behaviors. Each document type has its own list of available objects; they are grouped by document type and registered in the registry.
The structure of shared packages:
- compiler/diagnostics: The diagnostic implementation, which is a list of errors, warnings, and notes. These diagnostics can be produced by any step of the compilers.
- compiler/syntax: Provides the tokens in the form of syntax.TypedFlowObject and the AST in the form of the syntax.Node and syntax.Edge interfaces.
- compiler/scanner: The lexer implementation, which receives a flow document and produces a syntax.TypedFlowDocument.
- compiler/parser: The parser implementation, which receives a syntax.TypedFlowDocument and produces a syntax.ParsedFlowDocument.
- compiler/dependency: The dependency resolver implementation, dependency.Resolver, an interface that can be used to resolve dependencies in compiler engines and mocked in tests.
- compiler/analyzer: The static and dependency analyzer implementation, which executes on a syntax.ParsedFlowDocument and a dependency.Resolver.
The structure of each compiler engine:
- ir: The intermediate representation of the flow document, used by the compiler engine during compilation.
- engine: The engine implementation, which is responsible for compiling the flow document into the specific format for that flow document type and runtime engine.
- cctx: The context and state used during the compilation, which includes the logger, the diagnostics, the dependency resolver, and the intermediate representation.
Importing constraints:
- compiler: The innermost layer. Can NOT import any other layers.
- registry: Can import compiler.
- compilers: Can import registry and compiler packages. Each compiler engine must NOT import other compiler engines’ code.
- objects: The outermost layer. Can import the registry, compiler, and compilers packages.
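Putting the layers together, the top-level compilation is a chain of the five phases. Below is a minimal, self-contained sketch of that chain; all types and phase functions here are simplified stand-ins invented for illustration, not the real compiler API:

```go
package main

import "fmt"

// Simplified stand-ins for the real pipeline types.
type FlowDocument struct{ Objects []string }
type TypedFlowDocument struct{ Tokens []string }
type ParsedFlowDocument struct{ Nodes []string }
type Descriptor struct{ Events []string }

// The five phases: scanner, parser, analyzer, IR generation, output generation.
func scan(doc *FlowDocument) *TypedFlowDocument {
	return &TypedFlowDocument{Tokens: doc.Objects}
}
func parse(in *TypedFlowDocument) *ParsedFlowDocument {
	return &ParsedFlowDocument{Nodes: in.Tokens}
}
func analyze(in *ParsedFlowDocument) error    { return nil }
func lowerToIR(in *ParsedFlowDocument) []string { return in.Nodes }
func emit(ir []string) *Descriptor            { return &Descriptor{Events: ir} }

// Compile chains the phases; a real engine would also thread the
// diagnostics and the compilation context through each step.
func Compile(doc *FlowDocument) (*Descriptor, error) {
	typed := scan(doc)
	parsed := parse(typed)
	if err := analyze(parsed); err != nil {
		return nil, err
	}
	return emit(lowerToIR(parsed)), nil
}

func main() {
	out, _ := Compile(&FlowDocument{Objects: []string{"start", "event"}})
	fmt.Println(len(out.Events))
}
```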
Token
The flow document protobuf is first parsed into a syntax.TypedFlowDocument, which is a list of syntax.TypedFlowObject. Each syntax.TypedFlowObject is a token: a struct with the id, type, and value for that specific object type. The value is parsed into a specific typed protobuf message grouped under a oneof. From that value, we can access the properties and data for that specific object type.
package syntax

type TypedFlowObject struct {
    ID    uuid.UUID
    Type  FlowObjectType
    Model struct { // oneof
        SimpleEdge struct {
            Source string
            Target string
            Data   SimpleEdgeData
        }
        CustomAutoReply   struct{}
        IncomingRoomEvent struct{}
        SendWaMessage     struct{}
        // ...
    }
}
Package scanner implements the Scan function, which receives a flow document and produces a syntax.TypedFlowDocument:
package scanner

func Scan(diag *diagx.Result, doc *flowpb.FlowDocument) (out *syntax.TypedFlowDocument)

package syntax

type TypedFlowDocument struct {
    FlowDoc      *flowpb.FlowDocument
    ObjByID      map[uuid.UUID]*flowpb.FlowObject
    TypedObjByID map[uuid.UUID]*TypedFlowObject
    PropByObjID  map[uuid.UUID][]*flowpb.FlowObjectProperty
    PropByKey    map[Position]*flowpb.FlowObjectProperty
}
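As a rough illustration of what Scan does, the sketch below builds the by-ID lookup maps from a flat list of objects. All types here are simplified stand-ins for the real flowpb and syntax types:

```go
package main

import "fmt"

// Simplified stand-ins for flowpb.FlowObject and the typed token.
type FlowObject struct {
	ID   string
	Type string
}
type TypedFlowObject struct {
	ID   string
	Type string
}
type TypedFlowDocument struct {
	ObjByID      map[string]*FlowObject
	TypedObjByID map[string]*TypedFlowObject
}

// Scan walks the raw objects once and builds the lookup maps that
// the later phases rely on (here: indexed by object ID).
func Scan(objs []*FlowObject) *TypedFlowDocument {
	out := &TypedFlowDocument{
		ObjByID:      make(map[string]*FlowObject, len(objs)),
		TypedObjByID: make(map[string]*TypedFlowObject, len(objs)),
	}
	for _, o := range objs {
		out.ObjByID[o.ID] = o
		out.TypedObjByID[o.ID] = &TypedFlowObject{ID: o.ID, Type: o.Type}
	}
	return out
}

func main() {
	doc := Scan([]*FlowObject{{ID: "n1", Type: "INCOMING_ROOM_EVENT"}})
	fmt.Println(doc.TypedObjByID["n1"].Type)
}
```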
AST
The syntax.TypedFlowDocument is then parsed into a syntax.ParsedFlowDocument, which is a list of syntax.Node and syntax.Edge interfaces. Together they form the abstract syntax tree (AST) of the flow document. The syntax.Node interface represents a node in the flow document. It contains the id, the type, and the lists of incoming and outgoing edges. It also reports whether it has been visited, so the compiler engine can decide whether to create a new event/action or trigger a previous action.
package syntax

type Node interface {
    GetID() uuid.UUID
    GetType() flowpb.FlowObjectType
    GetCore() *NodeCore
    GetTypedFlowObject() *TypedFlowObject
    GetVisited() bool
    GetIncomingEdges() []Edge
    GetOutgoingEdges() []Edge
}

type NodeCore struct {
    *TypedFlowObject
    Visited       bool
    IncomingEdges []Edge
    OutgoingEdges []Edge
}
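The Visited flag matters during traversal: when a node is reachable through more than one edge, the engine can reuse the event/action it already emitted instead of creating a duplicate. A minimal sketch of that idea (with stand-in Node and Edge types, not the real syntax package):

```go
package main

import "fmt"

type Edge struct{ Target *Node }

// NodeCore mirrors the shared node state; concrete nodes embed it.
type NodeCore struct {
	ID            string
	Visited       bool
	OutgoingEdges []*Edge
}

type Node struct{ NodeCore }

// Walk visits nodes depth-first; the Visited flag lets the engine
// skip a node it has already processed, even if several edges lead
// to it.
func Walk(n *Node, visit func(*Node)) {
	if n.Visited {
		return
	}
	n.Visited = true
	visit(n)
	for _, e := range n.OutgoingEdges {
		Walk(e.Target, visit)
	}
}

func main() {
	a := &Node{NodeCore{ID: "a"}}
	b := &Node{NodeCore{ID: "b"}}
	a.OutgoingEdges = []*Edge{{Target: b}, {Target: b}} // b reachable twice
	count := 0
	Walk(a, func(*Node) { count++ })
	fmt.Println(count) // each node counted once
}
```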
The syntax.Edge interface represents an edge in the flow document. It connects two nodes, directed from the source node to the target node. A syntax.Node can have multiple incoming and outgoing edges. Each outgoing syntax.Edge can have a different source handle and target handle to indicate what should happen along that edge. For example, an edge triggered after a SendWaMessage action node may have the source handle button-click:1 to indicate that this edge should be followed when the customer clicks the first button in that template or interactive message.
package syntax

type Edge interface {
    GetID() uuid.UUID
    GetType() flowpb.FlowObjectType
    GetCore() *EdgeCore
    GetTypedFlowObject() *TypedFlowObject
    GetSourceID() uuid.UUID
    GetTargetID() uuid.UUID
    GetSrcHandle() string
    GetTgtHandle() string
    GetStartNode() Node
    GetEndNode() Node
}

type EdgeCore struct {
    *TypedFlowObject // simple edge
    SourceID  uuid.UUID
    TargetID  uuid.UUID
    SrcHandle string // prefix trimmed
    TgtHandle string // prefix trimmed
    StartNode Node
    EndNode   Node
}
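Routing on source handles like button-click:1 could look roughly like the sketch below. The Edge struct and edgeForButton helper are inventions for illustration, not the real API:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

type Edge struct {
	SrcHandle string // e.g. "button-click:1"
	TargetID  string
}

// edgeForButton picks the outgoing edge whose source handle matches
// the clicked button index, mirroring how a runtime could route a
// customer's button click to the right branch.
func edgeForButton(edges []Edge, button int) (Edge, bool) {
	for _, e := range edges {
		idx, ok := strings.CutPrefix(e.SrcHandle, "button-click:")
		if !ok {
			continue
		}
		if n, err := strconv.Atoi(idx); err == nil && n == button {
			return e, true
		}
	}
	return Edge{}, false
}

func main() {
	edges := []Edge{
		{SrcHandle: "button-click:1", TargetID: "reply-yes"},
		{SrcHandle: "button-click:2", TargetID: "reply-no"},
	}
	e, _ := edgeForButton(edges, 2)
	fmt.Println(e.TargetID) // reply-no
}
```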
The syntax.Node interface is extended by the registry.Node interface, which adds a few methods to help the scanner initialize the node and the analyzer validate the node type. It is extended further by the autoreply.Node interface, which adds the AutoReplyKind() method to indicate the kind of the node. That is then extended again by the autoreply.EventNode and autoreply.ActionNode interfaces to provide additional behavior for each kind of node: start, event, or action.
package registry

type Object interface {
    scanner.ObjectInitializer
    analyzer.ObjectValidator
}

type Node interface {
    Object
    syntax.Node
}

package autoreply

type Node interface {
    registry.Node
    AutoReplyKind() ir.AutoReplyKind
}

type EventNode interface {
    Node
    ProcessEvent(cc *cctx.CompilationContext, branch *ir.Branch) (*etapb.EventDescriptor, *ir.AvailableBranches)
}

type ActionNode interface {
    Node
    ProcessAction(cc *cctx.CompilationContext, branch *ir.Branch) (*etapb.ActionDescriptor, *ir.AvailableBranches)
}
Package parser implements the Parse function, which receives a syntax.TypedFlowDocument and produces a syntax.ParsedFlowDocument, a list of syntax.Node and syntax.Edge:
package parser

func Parse(diag *diagx.Result, in *syntax.TypedFlowDocument) *syntax.ParsedFlowDocument

package syntax

type ParsedFlowDocument struct {
    Root  Node
    Nodes []Node
    Edges []Edge
}
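Conceptually, Parse wires each edge to its source and target nodes and picks a root. The sketch below uses simplified stand-in types and treats the node with no incoming edges as the root, ignoring the document classifier and diagnostics:

```go
package main

import "fmt"

type Node struct {
	ID            string
	IncomingEdges []*Edge
	OutgoingEdges []*Edge
}
type Edge struct {
	SourceID, TargetID string
	Start, End         *Node
}
type ParsedFlowDocument struct {
	Root  *Node
	Nodes []*Node
	Edges []*Edge
}

// Parse links every edge to its source and target nodes, then picks
// the node with no incoming edges as the root.
func Parse(nodes []*Node, edges []*Edge) *ParsedFlowDocument {
	byID := make(map[string]*Node, len(nodes))
	for _, n := range nodes {
		byID[n.ID] = n
	}
	for _, e := range edges {
		e.Start, e.End = byID[e.SourceID], byID[e.TargetID]
		e.Start.OutgoingEdges = append(e.Start.OutgoingEdges, e)
		e.End.IncomingEdges = append(e.End.IncomingEdges, e)
	}
	doc := &ParsedFlowDocument{Nodes: nodes, Edges: edges}
	for _, n := range nodes {
		if len(n.IncomingEdges) == 0 {
			doc.Root = n
			break
		}
	}
	return doc
}

func main() {
	start, action := &Node{ID: "start"}, &Node{ID: "send"}
	doc := Parse([]*Node{action, start}, []*Edge{{SourceID: "start", TargetID: "send"}})
	fmt.Println(doc.Root.ID) // start
}
```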
IR
Each compiler engine implements its own set of intermediate representation (IR) types to support the compilation process. The auto reply engine implements the following IR in the package compilers/autoreply/ir:
package ir

type Document struct {
    Root *Branch
}

type Branch struct {
    Edge          syntax.Edge
    Event         Event
    Trigger       Trigger
    Action        Action
    GotoBranch    *Branch
    TriggerBranch *Branch
    Branches      []*Branch
}
The ir.Branch type is the main IR type for the auto reply engine. It represents a branch in the output ETA. Each branch may be a single edge or multiple edges, starting from a trigger (the start node or a template/interactive SendWaMessage action node), going through an event node, and arriving at another action node. The main job of the compilation engine is to convert the syntax.ParsedFlowDocument into an ir.Document and then convert the ir.Document into an ETA descriptor, the final output.
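The lowering from the parsed graph into a branch tree can be sketched like this (heavily simplified: one branch per node, no cycle handling, and all types are stand-ins invented for illustration):

```go
package main

import "fmt"

type Node struct {
	ID            string
	Kind          string // "start", "event", or "action"
	OutgoingEdges []*Edge
}
type Edge struct{ End *Node }

// Branch loosely mirrors ir.Branch: a step on the path from a
// trigger, through an event, to the next action, with child
// branches hanging off it.
type Branch struct {
	NodeID   string
	Branches []*Branch
}

// build lowers the node graph into a branch tree. The real engine
// folds trigger/event/action triples into one branch; here every
// node simply becomes its own branch.
func build(n *Node) *Branch {
	b := &Branch{NodeID: n.ID}
	for _, e := range n.OutgoingEdges {
		b.Branches = append(b.Branches, build(e.End))
	}
	return b
}

func main() {
	action := &Node{ID: "send", Kind: "action"}
	event := &Node{ID: "incoming", Kind: "event", OutgoingEdges: []*Edge{{End: action}}}
	root := build(&Node{ID: "start", Kind: "start", OutgoingEdges: []*Edge{{End: event}}})
	fmt.Println(root.Branches[0].Branches[0].NodeID) // send
}
```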
Object registry
Each phase of the compilation process produces a list of objects of various types, each implementing a specific interface. For example, the scanner produces a list of syntax.TypedFlowObject, which can be SimpleEdge, CustomAutoReply, IncomingRoomEvent, etc. The parser produces a list of syntax.Node, which can be CustomAutoReply, IncomingRoomEvent, SendWaMessage, etc. And each document type has a different set of available building blocks: for example, the auto reply document uses CustomAutoReply while the campaign document uses CustomSendCampaign. Therefore, the registry package is required to keep track of all the available building blocks for each document type and provide a way to initialize and validate them.
The scanner.ObjectInitializer interface is used to initialize the syntax.TypedFlowObject with various types. The initializer prepares a container proto.Message of the correct type to be fed to the scanner:
package registry

type ObjectInitializer interface {
    InitTypedFlowObject(out *syntax.TypedFlowObject) proto.Message
}

func RegisterObjectInitializer(typ flowpb.FlowObjectType, initializer ObjectInitializer) {
    objectInitializers[typ] = initializer
}
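A registry like this is typically a package-level map populated at init time, so importing an objects package is enough to make its types available to the scanner. A simplified, self-contained sketch (the string-returning initializer is an invention for illustration; the real one returns a proto.Message):

```go
package main

import "fmt"

type FlowObjectType int

const (
	TypeSimpleEdge FlowObjectType = iota
	TypeIncomingRoomEvent
)

// ObjectInitializer mirrors the registry interface: each object type
// knows how to prepare its own typed container.
type ObjectInitializer interface {
	InitTypedFlowObject() string // simplified: returns a type name
}

var objectInitializers = map[FlowObjectType]ObjectInitializer{}

// RegisterObjectInitializer rejects duplicate registrations so a
// misconfigured objects package fails loudly at startup.
func RegisterObjectInitializer(typ FlowObjectType, init ObjectInitializer) {
	if _, dup := objectInitializers[typ]; dup {
		panic("duplicate initializer registration")
	}
	objectInitializers[typ] = init
}

type incomingRoomEvent struct{}

func (incomingRoomEvent) InitTypedFlowObject() string { return "IncomingRoomEvent" }

// Object packages register themselves at init time.
func init() {
	RegisterObjectInitializer(TypeIncomingRoomEvent, incomingRoomEvent{})
}

func main() {
	fmt.Println(objectInitializers[TypeIncomingRoomEvent].InitTypedFlowObject())
}
```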
The parser.DocumentClassifier interface is used to provide each flow document type with a set of available building blocks. And each building block implements the parser.ObjectConstructor interface, which the parser uses to create the specific syntax.Node and syntax.Edge implementations. For example, the NewNode() method of the incoming room event implementation of parser.ObjectConstructor may receive a syntax.NodeCore with type FLOW_OBJECT_TYPE_INCOMING_ROOM_EVENT and return an incoming_room_event.IncomingRoomEvent event node wrapping around that core:
package registry

type DocumentClassifier interface {
    Category() flowpb.FlowDocumentCategory
    DocumentType() flowpb.FlowDocumentType
    StartNodeType() flowpb.FlowObjectType
    AcceptNodeType(flowpb.FlowObjectType) bool
}

type ObjectConstructor interface {
    IsEdge() bool
    NewEdge(*syntax.EdgeCore) syntax.Edge
    NewNode(*syntax.NodeCore) syntax.Node
}
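A concrete node type might implement the constructor like the sketch below. All types are simplified stand-ins; note that a node-only object can return nil from NewEdge, since the parser only calls NewEdge when IsEdge() reports true:

```go
package main

import "fmt"

type NodeCore struct{ ID string }
type EdgeCore struct{ ID string }

type Node interface{ GetID() string }
type Edge interface{ GetID() string }

// ObjectConstructor mirrors the registry interface: an object either
// builds an edge or a node around the parsed core.
type ObjectConstructor interface {
	IsEdge() bool
	NewEdge(*EdgeCore) Edge
	NewNode(*NodeCore) Node
}

// IncomingRoomEvent is a node wrapping the shared core.
type IncomingRoomEvent struct{ *NodeCore }

func (n *IncomingRoomEvent) GetID() string { return n.ID }

// incomingRoomEventCtor is the node-only constructor for it.
type incomingRoomEventCtor struct{}

func (incomingRoomEventCtor) IsEdge() bool             { return false }
func (incomingRoomEventCtor) NewEdge(*EdgeCore) Edge   { return nil }
func (incomingRoomEventCtor) NewNode(c *NodeCore) Node { return &IncomingRoomEvent{NodeCore: c} }

func main() {
	var ctor ObjectConstructor = incomingRoomEventCtor{}
	n := ctor.NewNode(&NodeCore{ID: "evt-1"})
	fmt.Println(n.GetID())
}
```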
Each compiler engine provides its own node classifier to further separate the kind of its own node types. For example, the auto reply engine provides the following node classifier:
package autoreply

func ValidateNodeClassifier(sample syntax.Node) error {
    kindGetter, ok := sample.(Node)
    if !ok {
        return errors.Newf("node %T does not implement autoreply.Node", sample)
    }
    switch kindGetter.AutoReplyKind() {
    case ir.KindStart:
        // nothing
    case ir.KindEvent:
        if _, ok := sample.(EventNode); !ok {
            return errors.Newf("event node %T does not implement autoreply.EventNode", sample)
        }
    case ir.KindAction:
        if _, ok := sample.(ActionNode); !ok {
            return errors.Newf("action node %T does not implement autoreply.ActionNode", sample)
        }
    }
    return nil
}
The registry packages take care of registering the available building blocks for each document type, feeding the scanner and parser packages the corresponding initializers and constructors they need. This helps keep the scanner and parser code clean, without worrying about the details of each type of document or object. The autoreply engine can cast a syntax.Node into autoreply.EventNode or autoreply.ActionNode to access the specific methods it needs for that kind of object.
Let's stay connected!
Author
I'm Oliver Nguyen. A software maker working mostly in Go and JavaScript. I enjoy learning and seeing a better version of myself each day. Occasionally I spin off new open source projects and share knowledge and thoughts during my journey. Connect with me, or subscribe to my posts.