cache

Imports

"bytes"
"crypto/sha256"
"encoding/hex"
"errors"
"fmt"
"internal/godebug"
"io"
"io/fs"
"os"
"path/filepath"
"strconv"
"strings"
"time"
"cmd/go/internal/base"
"cmd/go/internal/lockedfile"
"cmd/go/internal/mmap"
"sync"
"cmd/go/internal/cfg"
"hash"
"runtime"
"bufio"
"cmd/go/internal/cacheprog"
"cmd/internal/quoted"
"context"
"encoding/base64"
"encoding/json"
"internal/goexperiment"
"log"
"os/exec"
"sync/atomic"

Constants & Variables

DebugTest var #

DebugTest is set when GODEBUG=gocachetest=1 is in the environment.

var DebugTest = false

HashSize const #

HashSize is the number of bytes in a hash.

const HashSize = 32

cacheREADME const #

cacheREADME is a message stored in a README in the cache directory. Because the cache lives outside the normal Go trees, we leave the README as a courtesy to explain where it came from.

const cacheREADME = `This directory holds cached build artifacts from the Go build system.
Run "go clean -cache" if the directory is getting too large.
Run "go clean -fuzzcache" to delete the fuzz cache.
See golang.org to learn more about Go.
`

debugHash var #

var debugHash = false

defaultDir var #

var defaultDir string

defaultDirChanged var #

var defaultDirChanged bool

defaultDirErr var #

var defaultDirErr error

defaultDirOnce var #

var defaultDirOnce sync.Once

entrySize const #

const entrySize = 2 + 1 + hexSize + 1 + hexSize + 1 + 20 + 1 + 20 + 1

errCacheprogClosed var #

var errCacheprogClosed = errors.New("GOCACHEPROG program closed unexpectedly")

errVerifyMode var #

var errVerifyMode = errors.New("gocacheverify=1")

gocachehash var #

var gocachehash = godebug.New("gocachehash")

gocachetest var #

var gocachetest = godebug.New("gocachetest")

gocacheverify var #

var gocacheverify = godebug.New("gocacheverify")

hashDebug var #

In GODEBUG=gocacheverify=1 mode, hashDebug holds the input to every computed hash ID, so that we can work backward from the ID involved in a cache entry mismatch to a description of what should be there.

var hashDebug struct{...}

hashFileCache var #

var hashFileCache struct{...}

hashSalt var #

hashSalt is a salt string added to the beginning of every hash created by NewHash. Using the Go version makes sure that different versions of the go command (or even different Git commits during work on the development branch) do not address the same cache entries, so that a bug in one version does not affect the execution of other versions. This salt will result in additional ActionID files in the cache, but not additional copies of the large output files, which are still addressed by unsalted SHA256. We strip any GOEXPERIMENTs the go tool was built with from this version string on the assumption that they shouldn't affect go tool execution. This allows bootstrapping to converge faster: dist builds go_bootstrap without any experiments, so by stripping experiments go_bootstrap and the final go binary will use the same salt.

var hashSalt = []byte(stripExperiment(runtime.Version()))

hexSize const #

action entry file is "v1 <hex id> <hex out> <decimal size space-padded to 20 bytes> <unixnano space-padded to 20 bytes>\n"

const hexSize = HashSize * 2
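
As a rough illustration of why the entry size is a compile-time constant, the sketch below formats one fixed-width index line. The layout (a "v1" tag, two hex IDs, then size and timestamp each space-padded to 20 bytes) is stated here as an assumption about the on-disk format, and `formatEntry` is a hypothetical helper.

```go
package main

import "fmt"

const (
	hashSize = 32           // mirrors HashSize
	hexSize  = hashSize * 2 // hex characters needed for one ID
	// Assumed layout: "v1" SP id SP out SP size(20) SP time(20) LF.
	entrySize = 2 + 1 + hexSize + 1 + hexSize + 1 + 20 + 1 + 20 + 1
)

// formatEntry is a hypothetical helper that renders one index entry line.
func formatEntry(id, out [hashSize]byte, size, unixNano int64) string {
	return fmt.Sprintf("v1 %x %x %20d %20d\n", id, out, size, unixNano)
}

func main() {
	var id, out [hashSize]byte
	id[0], out[0] = 0xab, 0xcd
	line := formatEntry(id, out, 1234, 1700000000000000000)
	fmt.Print(line)
	fmt.Println(len(line) == entrySize)
}
```

A fixed width lets a reader reject a truncated or corrupted entry by checking its length before parsing any field.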

initDefaultCacheOnce var #

var initDefaultCacheOnce = sync.OnceValue(initDefaultCache)

mtimeInterval const #

Time constants for cache expiration. We set the mtime on a cache file on each use, but at most once per mtimeInterval (1 hour), to avoid causing many unnecessary inode updates. The mtimes therefore roughly reflect "time of last use" but may in fact be older by at most an hour. We scan the cache for entries to delete at most once per trimInterval (1 day). When we do scan the cache, we delete entries that have not been used for at least trimLimit (5 days). Statistics gathered from a month of usage by Go developers found that essentially all reuse of cached entries happened within 5 days of the previous reuse. See golang.org/issue/22990.

const mtimeInterval = 1 * time.Hour

trimInterval const #

The cache is scanned for entries to delete at most once per trimInterval (1 day). See mtimeInterval for the full cache expiration policy.

const trimInterval = 24 * time.Hour

trimLimit const #

When the cache is scanned, entries that have not been used for at least trimLimit (5 days) are deleted. See mtimeInterval for the full cache expiration policy.

const trimLimit = 5 * 24 * time.Hour

verify var #

verify controls whether to run the cache in verify mode. In verify mode, the cache always returns errMissing from Get but then double-checks in Put that the data being written exactly matches any existing entry. This provides an easy way to detect program behavior that would have been different had the cache entry been returned from Get. verify is enabled by setting the environment variable GODEBUG=gocacheverify=1.

var verify = false
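
Verify mode can be pictured with a toy in-memory cache. This is only a sketch of the idea: `memCache` and its string keys are hypothetical, and returning an error on mismatch is a simplification chosen for the example rather than a description of how the package reports a mismatch.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

var errMissing = errors.New("cache entry not found")

// memCache is a hypothetical in-memory stand-in for a cache with a verify flag.
type memCache struct {
	verify bool
	data   map[string][]byte
}

// Get always misses in verify mode, forcing the caller to recompute.
func (c *memCache) Get(key string) ([]byte, error) {
	if c.verify {
		return nil, errMissing
	}
	v, ok := c.data[key]
	if !ok {
		return nil, errMissing
	}
	return v, nil
}

// Put checks, in verify mode, that recomputed data matches any existing entry.
func (c *memCache) Put(key string, val []byte) error {
	if old, ok := c.data[key]; ok && c.verify && !bytes.Equal(old, val) {
		return fmt.Errorf("verify: entry %q changed: had %d bytes, now %d", key, len(old), len(val))
	}
	c.data[key] = val
	return nil
}

func main() {
	c := &memCache{verify: true, data: map[string][]byte{"build": []byte("v1")}}
	_, err := c.Get("build")
	fmt.Println(err)
	fmt.Println(c.Put("build", []byte("v1")))
	fmt.Println(c.Put("build", []byte("v2")))
}
```

A mismatch in Put means the computation is not actually repeatable, so serving the cached entry from Get would have changed program behavior.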

Type Aliases

ActionID type #

An ActionID is a cache action key, the hash of a complete description of a repeatable computation (command line, environment variables, input file contents, executable contents).

type ActionID [HashSize]byte

OutputID type #

An OutputID is a cache output key, the hash of an output of a computation.

type OutputID [HashSize]byte

Interfaces

Cache interface #

Cache is the interface used by cmd/go.

type Cache interface {
Get(ActionID) (Entry, error)
Put(ActionID, io.ReadSeeker) (_ OutputID, size int64, _ error)
Close() error
OutputFile(OutputID) string
FuzzDir() string
}

Structs

DiskCache struct #

A DiskCache is a package cache, backed by a file system directory tree.

type DiskCache struct {
dir string
now func() time.Time
}

Entry struct #

type Entry struct {
OutputID OutputID
Size int64
Time time.Time
}

Hash struct #

A Hash provides access to the canonical hash function used to index the cache. The current implementation uses salted SHA256, but clients must not assume this.

type Hash struct {
h hash.Hash
name string
buf *bytes.Buffer
}

ProgCache struct #

ProgCache implements Cache via JSON messages over stdin/stdout to a child helper process which can then implement whatever caching policy/mechanism it wants. See https://github.com/golang/go/issues/59719

type ProgCache struct {
cmd *exec.Cmd
stdout io.ReadCloser
stdin io.WriteCloser
bw *bufio.Writer
jenc *json.Encoder
can map[cacheprog.Cmd]bool
fuzzDirCache Cache
closing atomic.Bool
ctx context.Context
ctxCancel context.CancelFunc
readLoopDone chan struct{}
mu sync.Mutex
nextID int64
inFlight map[int64]chan<- *cacheprog.Response
outputFile map[OutputID]string
writeMu sync.Mutex
}
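
The nextID, inFlight, and writeMu fields suggest a request/response multiplexing pattern, which the sketch below illustrates with an in-process fake helper connected by pipes. Everything here is hypothetical: the `request` and `response` shapes are invented for the example and are not the cacheprog message types, and no child process is started.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"sync"
)

// request and response are made-up message shapes for this sketch.
type request struct {
	ID      int64  `json:"id"`
	Command string `json:"command"`
}

type response struct {
	ID  int64 `json:"id"`
	Hit bool  `json:"hit"`
}

type client struct {
	mu       sync.Mutex // guards nextID and inFlight
	nextID   int64
	inFlight map[int64]chan<- *response
	writeMu  sync.Mutex // serializes writes to the helper
	enc      *json.Encoder
}

// readLoop decodes responses and hands each to the waiter with the matching ID.
func (c *client) readLoop(r io.Reader) {
	dec := json.NewDecoder(r)
	for {
		res := new(response)
		if err := dec.Decode(res); err != nil {
			return
		}
		c.mu.Lock()
		ch := c.inFlight[res.ID]
		delete(c.inFlight, res.ID)
		c.mu.Unlock()
		if ch != nil {
			ch <- res
		}
	}
}

// send assigns an ID, registers a reply channel, writes the request, and waits.
func (c *client) send(cmd string) (*response, error) {
	ch := make(chan *response, 1)
	c.mu.Lock()
	c.nextID++
	id := c.nextID
	c.inFlight[id] = ch
	c.mu.Unlock()

	c.writeMu.Lock()
	err := c.enc.Encode(request{ID: id, Command: cmd})
	c.writeMu.Unlock()
	if err != nil {
		return nil, err
	}
	return <-ch, nil
}

// newLoopback wires a client to a fake helper that reports every request as a hit.
func newLoopback() *client {
	reqR, reqW := io.Pipe()
	resR, resW := io.Pipe()
	go func() { // stand-in for the child process
		dec, enc := json.NewDecoder(reqR), json.NewEncoder(resW)
		for {
			var req request
			if dec.Decode(&req) != nil {
				return
			}
			enc.Encode(response{ID: req.ID, Hit: true})
		}
	}()
	c := &client{inFlight: make(map[int64]chan<- *response), enc: json.NewEncoder(reqW)}
	go c.readLoop(resR)
	return c
}

func main() {
	c := newLoopback()
	res, err := c.send("get")
	if err != nil {
		fmt.Println("send failed:", err)
		return
	}
	fmt.Println(res.ID, res.Hit)
}
```

Keeping a separate write lock means a slow write to the helper does not block the read loop from delivering responses to other waiters.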

entryNotFoundError struct #

An entryNotFoundError indicates that a cache entry was not found, with an optional underlying reason.

type entryNotFoundError struct {
Err error
}

noVerifyReadSeeker struct #

noVerifyReadSeeker is an io.ReadSeeker wrapper sentinel type that says that Cache.Put should skip the verify check (from GODEBUG=gocacheverify=1).

type noVerifyReadSeeker struct {
io.ReadSeeker
}

Functions

Close method #

func (c *ProgCache) Close() error

Close method #

func (c *DiskCache) Close() error

Default function #

Default returns the default cache to use. It never returns nil.

func Default() Cache

DefaultDir function #

DefaultDir returns the effective GOCACHE setting. It returns "off" if the cache is disabled, and reports whether the effective value differs from GOCACHE.

func DefaultDir() (string, bool)

Error method #

func (e *entryNotFoundError) Error() string

FileHash function #

FileHash returns the hash of the named file. It caches repeated lookups for a given file, and the cache entry for a file can be initialized using SetFileHash. The hash used by FileHash is not the same as the hash used by NewHash.

func FileHash(file string) ([HashSize]byte, error)

FuzzDir method #

func (c *ProgCache) FuzzDir() string

FuzzDir method #

FuzzDir returns a subdirectory within the cache for storing fuzzing data. The subdirectory may not exist. This directory is managed by the internal/fuzz package. Files in this directory aren't removed by the 'go clean -cache' command or by Trim. They may be removed with 'go clean -fuzzcache'. TODO(#48526): make Trim remove unused files from this directory.

func (c *DiskCache) FuzzDir() string

Get method #

Get looks up the action ID in the cache, returning the corresponding output ID and file size, if any. Note that finding an output ID does not guarantee that the saved file for that output ID is still available.

func (c *DiskCache) Get(id ActionID) (Entry, error)

Get method #

func (c *ProgCache) Get(a ActionID) (Entry, error)

GetBytes function #

GetBytes looks up the action ID in the cache and returns the corresponding output bytes. GetBytes should only be used for data that can be expected to fit in memory.

func GetBytes(c Cache, id ActionID) ([]byte, Entry, error)

GetFile function #

GetFile looks up the action ID in the cache and returns the name of the corresponding data file.

func GetFile(c Cache, id ActionID) (file string, entry Entry, err error)

GetMmap function #

GetMmap looks up the action ID in the cache and returns the corresponding output bytes. GetMmap should only be used for data that can be expected to fit in memory.

func GetMmap(c Cache, id ActionID) ([]byte, Entry, bool, error)

NewHash function #

NewHash returns a new Hash. The caller is expected to Write data to it and then call Sum.

func NewHash(name string) *Hash

Open function #

Open opens and returns the cache in the given directory. It is safe for multiple processes on a single machine to use the same cache directory in a local file system simultaneously. They will coordinate using operating system file locks and may duplicate effort but will not corrupt the cache. However, it is NOT safe for multiple processes on different machines to share a cache directory (for example, if the directory were stored in a network file system). File locking is notoriously unreliable in network file systems and may not suffice to protect the cache.

func Open(dir string) (*DiskCache, error)

OutputFile method #

OutputFile returns the name of the cache file storing output with the given OutputID.

func (c *DiskCache) OutputFile(out OutputID) string

OutputFile method #

func (c *ProgCache) OutputFile(o OutputID) string

Put method #

Put stores the given output in the cache as the output for the action ID. It may read file twice. The content of file must not change between the two passes.

func (c *DiskCache) Put(id ActionID, file io.ReadSeeker) (OutputID, int64, error)

Put method #

func (c *ProgCache) Put(a ActionID, file io.ReadSeeker) (_ OutputID, size int64, _ error)

PutBytes function #

PutBytes stores the given bytes in the cache as the output for the action ID.

func PutBytes(c Cache, id ActionID, data []byte) error

PutExecutable method #

PutExecutable stores the given output in the cache as the output for the action ID, in a file with the given base name and with the executable mode bit set. It may read file twice. The content of file must not change between the two passes.

func (c *DiskCache) PutExecutable(id ActionID, name string, file io.ReadSeeker) (OutputID, int64, error)

PutNoVerify function #

PutNoVerify is like Put but disables the verify check when GODEBUG=gocacheverify=1 is set. It is meant for data that is OK to cache but that we expect to vary slightly from run to run, like test output containing times and the like.

func PutNoVerify(c Cache, id ActionID, file io.ReadSeeker) (OutputID, int64, error)

SetFileHash function #

SetFileHash sets the hash returned by FileHash for file.

func SetFileHash(file string, sum [HashSize]byte)

Subkey function #

Subkey returns an action ID corresponding to mixing a parent action ID with a string description of the subkey.

func Subkey(parent ActionID, desc string) ActionID

Sum method #

Sum returns the hash of the data written previously.

func (h *Hash) Sum() [HashSize]byte

Trim method #

Trim removes old cache entries that are unlikely to be reused.

func (c *DiskCache) Trim() error

Unwrap method #

func (e *entryNotFoundError) Unwrap() error

Write method #

Write writes data to the running hash.

func (h *Hash) Write(b []byte) (int, error)

copyFile method #

copyFile copies file into the cache, expecting it to have the given output ID and size, if that file is not present already.

func (c *DiskCache) copyFile(file io.ReadSeeker, executableName string, out OutputID, size int64, perm os.FileMode) error

fileName method #

fileName returns the name of the file corresponding to the given id.

func (c *DiskCache) fileName(id [HashSize]byte, key string) string

get method #

get is Get but does not respect verify mode, so that Put can use it.

func (c *DiskCache) get(id ActionID) (Entry, error)

init function #

func init()

initDefaultCache function #

initDefaultCache does the work of finding the default cache the first time Default is called.

func initDefaultCache() Cache

initEnv function #

func initEnv()

markUsed method #

markUsed makes a best-effort attempt to update mtime on file, so that mtime reflects cache access time. Because the reflection only needs to be approximate, and to reduce the amount of disk activity caused by using cache entries, markUsed only updates the mtime if the current mtime is more than an hour old. This heuristic eliminates nearly all of the mtime updates that would otherwise happen, while still keeping the mtimes useful for cache trimming. markUsed reports whether the file is a directory (an executable cache entry).

func (c *DiskCache) markUsed(file string) (isDir bool)

noteOutputFile method #

func (c *ProgCache) noteOutputFile(o OutputID, diskPath string)

put method #

func (c *DiskCache) put(id ActionID, executableName string, file io.ReadSeeker, allowVerify bool) (OutputID, int64, error)

putIndexEntry method #

putIndexEntry adds an entry to the cache recording that executing the action with the given id produces an output with the given output id (hash) and size.

func (c *DiskCache) putIndexEntry(id ActionID, out OutputID, size int64, allowVerify bool) error

readLoop method #

func (c *ProgCache) readLoop(readLoopDone chan<- struct{})

reverseHash function #

reverseHash returns the input used to compute the hash id.

func reverseHash(id [HashSize]byte) string

send method #

func (c *ProgCache) send(ctx context.Context, req *cacheprog.Request) (*cacheprog.Response, error)

startCacheProg function #

startCacheProg starts the prog binary (with optional space-separated flags) and returns a Cache implementation that talks to it. It blocks a few seconds to wait for the child process to successfully start and advertise its capabilities.

func startCacheProg(progAndArgs string, fuzzDirCache Cache) Cache

stripExperiment function #

stripExperiment strips any GOEXPERIMENT configuration from the Go version string.

func stripExperiment(version string) string
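
A minimal sketch of such stripping is shown below. It assumes that experiments appear in the version string as a trailing " X:..." suffix, which is an assumption made for this example; `stripExperimentSketch` is a hypothetical name and the sample version strings are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// stripExperimentSketch drops an assumed trailing " X:<experiments>" suffix.
func stripExperimentSketch(version string) string {
	if i := strings.Index(version, " X:"); i >= 0 {
		return version[:i]
	}
	return version
}

func main() {
	fmt.Println(stripExperimentSketch("go1.99.0 X:someexperiment")) // made-up version string
	fmt.Println(stripExperimentSketch("go1.99.0"))
}
```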

trimSubdir method #

trimSubdir trims a single cache subdirectory.

func (c *DiskCache) trimSubdir(subdir string, cutoff time.Time)

writeToChild method #

func (c *ProgCache) writeToChild(req *cacheprog.Request, resc chan<- *cacheprog.Response) (err error)
