hub / github.com/klauspost/compress / encodeBlockDictGo

Function encodeBlockDictGo

s2/encode_all.go:890–1477

encodeBlockDictGo encodes a non-empty src to a guaranteed-large-enough dst. It assumes that the varint-encoded length of the decompressed bytes has already been written. It also assumes that:

len(dst) >= MaxEncodedLen(len(src)) &&
minNonLiteralBlockSize <= len(src) && len(src) <= maxBlockSize
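The two assumptions in the description can be sketched with the standard library alone. This is a minimal illustration, not the package's own code: `writeLengthPrefix` and `blockSizeOK` are hypothetical helpers, and the two constant values are assumptions made for this example rather than values confirmed by the listing.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Assumed values, for illustration only. The real constants are defined
// inside the s2 package and may differ.
const (
	minNonLiteralBlockSize = 32
	maxBlockSize           = 4 << 20
)

// writeLengthPrefix (hypothetical helper) writes the uvarint-encoded
// decompressed length at the start of dst and returns the bytes written.
// The block encoder expects this prefix to be in place before it runs.
func writeLengthPrefix(dst []byte, srcLen int) int {
	return binary.PutUvarint(dst, uint64(srcLen))
}

// blockSizeOK (hypothetical helper) checks the size precondition on src.
func blockSizeOK(srcLen int) bool {
	return minNonLiteralBlockSize <= srcLen && srcLen <= maxBlockSize
}

func main() {
	src := make([]byte, 1000)

	// Room for the prefix only; a real caller would size dst so that
	// len(dst) >= MaxEncodedLen(len(src)) holds for the block body.
	dst := make([]byte, binary.MaxVarintLen64)

	n := writeLengthPrefix(dst, len(src))
	fmt.Println(n, dst[:n], blockSizeOK(len(src)))
}
```

A caller would then hand `dst[n:]` to the block encoder, so the encoded block follows the length prefix directly.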

(dst, src []byte, dict *Dict)

Source from the content-addressed store, hash-verified

// len(dst) >= MaxEncodedLen(len(src)) &&
// minNonLiteralBlockSize <= len(src) && len(src) <= maxBlockSize
func encodeBlockDictGo(dst, src []byte, dict *Dict) (d int) {
	// Initialize the hash table.
	const (
		tableBits    = 14
		maxTableSize = 1 << tableBits
		maxAhead     = 8 // maximum bytes ahead without checking sLimit

		debug = false
	)
	dict.initFast()

	var table [maxTableSize]uint32

	// sLimit is when to stop looking for offset/length copies. The inputMargin
	// lets us use a fast path for emitLiteral in the main loop, while we are
	// looking for copies.
	sLimit := min(len(src)-inputMargin, MaxDictSrcOffset-maxAhead)

	// Bail if we can't compress to at least this.
	dstLimit := len(src) - len(src)>>5 - 5

	// nextEmit is where in src the next emitLiteral should start from.
	nextEmit := 0

	// The encoded form can start with a dict entry (copy or repeat).
	s := 0

	// Convert dict repeat to offset
	repeat := len(dict.dict) - dict.repeat
	cv := load64(src, 0)

	// While in dict
searchDict:
	for {
		// Next src position to check
		nextS := s + (s-nextEmit)>>6 + 4
		hash0 := hash6(cv, tableBits)
		hash1 := hash6(cv>>8, tableBits)
		if nextS > sLimit {
			if debug {
				fmt.Println("slimit reached", s, nextS)
			}
			break searchDict
		}
		candidateDict := int(dict.fastTable[hash0])
		candidateDict2 := int(dict.fastTable[hash1])
		candidate2 := int(table[hash1])
		candidate := int(table[hash0])
		table[hash0] = uint32(s)
		table[hash1] = uint32(s + 1)
		hash2 := hash6(cv>>16, tableBits)

		// Check repeat at offset checkRep.
		const checkRep = 1

		if repeat > s {
			candidate := len(dict.dict) - repeat + s
			if repeat-s >= 4 && uint32(cv) == load32(dict.dict, candidate) {
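Two arithmetic expressions in the excerpt above are easy to misread because of Go's operator precedence (`>>` binds tighter than `+` and `-`). The sketch below isolates them in standalone functions; `dstLimitFor` and `nextPos` are hypothetical names introduced here, and only the expressions themselves come from the listing.

```go
package main

import "fmt"

// dstLimitFor mirrors `len(src) - len(src)>>5 - 5` from the listing:
// the output budget is the input size minus 1/32 of it, minus 5 bytes.
// The listing's comment says the encoder bails if it cannot compress
// to at least this size.
func dstLimitFor(srcLen int) int {
	return srcLen - srcLen>>5 - 5
}

// nextPos mirrors `s + (s-nextEmit)>>6 + 4` from the listing. The longer
// the run of unmatched bytes since nextEmit, the larger the step to the
// next position to check.
func nextPos(s, nextEmit int) int {
	return s + (s-nextEmit)>>6 + 4
}

func main() {
	fmt.Println(dstLimitFor(1024)) // 1024 - 32 - 5 = 987
	fmt.Println(nextPos(0, 0))     // 0 + 0 + 4 = 4
	fmt.Println(nextPos(640, 0))   // 640 + 10 + 4 = 654
}
```

The step therefore starts at 4 bytes and grows by one for every 64 bytes scanned without emitting anything, which lets the search move faster through data that does not appear to compress.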

Callers 2

TestDict (function) · 0.85
Encode (method) · 0.85

Calls 7

initFast (method) · 0.80
load64 (function) · 0.70
hash6 (function) · 0.70
load32 (function) · 0.70
emitLiteral (function) · 0.70
emitRepeat (function) · 0.70
emitCopy (function) · 0.70
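Of the callees listed above, `load32` and `load64` are used in the excerpt to compare four bytes of the current 8-byte value `cv` against a candidate position. The following is a minimal sketch of that comparison, assuming both helpers read little-endian values at a byte offset. The stand-in implementations here are written for this example with `encoding/binary` and are not the package's own.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// load32 is a stand-in that reads 4 bytes at offset i as a little-endian
// uint32 (assumed behavior, not copied from the package).
func load32(b []byte, i int) uint32 {
	return binary.LittleEndian.Uint32(b[i:])
}

// load64 is a stand-in that reads 8 bytes at offset i as a little-endian
// uint64 (assumed behavior, not copied from the package).
func load64(b []byte, i int) uint64 {
	return binary.LittleEndian.Uint64(b[i:])
}

func main() {
	src := []byte("abcdefgh-abcdxxxx")

	// cv holds the 8 bytes starting at the second "abcd".
	cv := load64(src, 9)

	// With little-endian loads, uint32(cv) is the first 4 bytes at that
	// position, so one integer compare checks a 4-byte match.
	fmt.Println(uint32(cv) == load32(src, 0))

	// Shifting cv right by 8 bits drops the first byte, giving the
	// 4 bytes that start one position later.
	fmt.Println(uint32(cv>>8) == load32(src, 10))
}
```

This is why the listing can hash `cv`, `cv>>8`, and `cv>>16` to cover three adjacent positions from a single 8-byte load.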

Tested by 1

TestDict (function) · 0.68
