hub / github.com/klauspost/compress / encodeBlockDictGo

Function encodeBlockDictGo

s2/encode_all.go:890–1477

encodeBlockDictGo encodes a non-empty src to a guaranteed-large-enough dst. It assumes that the varint-encoded length of the decompressed bytes has already been written. It also assumes that: len(dst) >= MaxEncodedLen(len(src)) && minNonLiteralBlockSize <= len(src) && len(src) <= maxBlockSize

(dst, src []byte, dict *Dict) (d int)

Source from the content-addressed store, hash-verified

// len(dst) >= MaxEncodedLen(len(src)) &&
// minNonLiteralBlockSize <= len(src) && len(src) <= maxBlockSize
func encodeBlockDictGo(dst, src []byte, dict *Dict) (d int) {
	// Initialize the hash table.
	const (
		tableBits    = 14
		maxTableSize = 1 << tableBits
		maxAhead     = 8 // maximum bytes ahead without checking sLimit

		debug = false
	)
	dict.initFast()

	var table [maxTableSize]uint32

	// sLimit is when to stop looking for offset/length copies. The inputMargin
	// lets us use a fast path for emitLiteral in the main loop, while we are
	// looking for copies.
	sLimit := min(len(src)-inputMargin, MaxDictSrcOffset-maxAhead)

	// Bail if we can't compress to at least this.
	dstLimit := len(src) - len(src)>>5 - 5

	// nextEmit is where in src the next emitLiteral should start from.
	nextEmit := 0

	// The encoded form can start with a dict entry (copy or repeat).
	s := 0

	// Convert dict repeat to offset
	repeat := len(dict.dict) - dict.repeat
	cv := load64(src, 0)

	// While in dict
searchDict:
	for {
		// Next src position to check
		nextS := s + (s-nextEmit)>>6 + 4
		hash0 := hash6(cv, tableBits)
		hash1 := hash6(cv>>8, tableBits)
		if nextS > sLimit {
			if debug {
				fmt.Println("slimit reached", s, nextS)
			}
			break searchDict
		}
		candidateDict := int(dict.fastTable[hash0])
		candidateDict2 := int(dict.fastTable[hash1])
		candidate2 := int(table[hash1])
		candidate := int(table[hash0])
		table[hash0] = uint32(s)
		table[hash1] = uint32(s + 1)
		hash2 := hash6(cv>>16, tableBits)

		// Check repeat at offset checkRep.
		const checkRep = 1

		if repeat > s {
			candidate := len(dict.dict) - repeat + s
			if repeat-s >= 4 && uint32(cv) == load32(dict.dict, candidate) {
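Two pieces of arithmetic in the listing above are easy to misread, because in Go a shift binds tighter than addition or subtraction. The sketch below isolates them as standalone functions. The names nextProbe and dstLimit are hypothetical wrappers introduced for illustration; the expressions themselves are copied from the listing.

```go
package main

import "fmt"

// nextProbe mirrors `nextS := s + (s-nextEmit)>>6 + 4` from the listing.
// The step grows by one for every 64 bytes scanned since the last emitted
// literal, so long incompressible runs are skipped progressively faster.
func nextProbe(s, nextEmit int) int {
	return s + (s-nextEmit)>>6 + 4
}

// dstLimit mirrors `len(src) - len(src)>>5 - 5` from the listing: the
// output must be smaller than the input by about 1/32 of its length plus
// 5 bytes, otherwise the encoder bails out.
func dstLimit(srcLen int) int {
	return srcLen - srcLen>>5 - 5
}

func main() {
	fmt.Println(nextProbe(0, 0))   // 4
	fmt.Println(nextProbe(640, 0)) // 654
	fmt.Println(dstLimit(65536))   // 63483
}
```

With no pending literals the search advances four bytes at a time; after 640 unmatched bytes the step has grown to 14.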

Callers 2

TestDict (Function) · 0.85

Encode (Method) · 0.85

Calls 7

initFast (Method) · 0.80

load64 (Function) · 0.70

hash6 (Function) · 0.70

load32 (Function) · 0.70

emitLiteral (Function) · 0.70

emitRepeat (Function) · 0.70

emitCopy (Function) · 0.70
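The bodies of load64 and hash6 are not shown on this page. As a rough sketch of what helpers of this kind usually do, the following assumes a little-endian 8-byte load and a multiplicative hash over the lowest 6 bytes. The multiplier constant and both function bodies are assumptions for illustration and may differ from the library's actual code.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// load64 reads 8 bytes from b starting at index i as a little-endian
// uint64. Assumed behaviour; the library's helper is not shown here.
func load64(b []byte, i int) uint64 {
	return binary.LittleEndian.Uint64(b[i:])
}

// hash6 hashes the lowest 6 bytes of u into h bits. Shifting left by 16
// discards the top two bytes, and the multiply-then-shift keeps the
// upper h bits of the product. The prime is an assumed constant.
func hash6(u uint64, h uint8) uint32 {
	const prime6bytes = 227718039650203
	return uint32(((u << 16) * prime6bytes) >> ((64 - h) & 63))
}

func main() {
	const tableBits = 14
	src := []byte("abcdefgh-abcdefgh")

	a := hash6(load64(src, 0), tableBits)
	b := hash6(load64(src, 9), tableBits)

	// Both windows start with the same 8 bytes, so the hashes match
	// and both index the same slot of a 1<<tableBits entry table.
	fmt.Println(a == b, a < 1<<tableBits)
}
```

Hashing cv, cv>>8 and cv>>16, as the listing does, then amounts to hashing the 6-byte windows that start at s, s+1 and s+2 from a single 8-byte load.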

Tested by 1

TestDict (Function) · 0.68
