
Language column in language_stat table is too small (needs to be at least 34) #12379

@somera

  • Gitea version (or commit ref): 1.12.3
  • Git version: 2.25.1
  • Operating system: Linux nuc-mini-server 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • Database (use [x]):
    • PostgreSQL
    • MySQL
    • MSSQL
    • SQLite
  • Can you reproduce the bug at https://try.gitea.io:
    • Yes (provide example URL)
    • No
    • Not relevant
  • Log gist:
2020/07/31 11:33:25 ...m.io/xorm/core/tx.go:157:QueryContext() [I] [SQL] INSERT INTO "language_stat" ("repo_id","commit_id","is_primary","language","size","created_unix") VALUES ($1, $2, $3, $4, $5, $6) RETURNING "id" [786 dd0f55d6cee3bcf7d483522684933dc73f6b1831 true Glyph Bitmap Distribution Format 141557 1596188005] - 259.548µs
2020/07/31 11:33:25 ...o/xorm/session_tx.go:46:Rollback() [I] [SQL] ROLL BACK [] - 133.808µs
2020/07/30 19:20:46 ...dexer/stats/queue.go:24:handle() [E] stats queue idexer.Index(786) failed: pq: Wert zu lang für Typ character varying(30)
2020/07/31 11:33:25 ...dexer/stats/queue.go:24:handle() [E] stats queue idexer.Index(786) failed: pq: Wert zu lang für Typ character varying(30)

(The PostgreSQL message is German for "value too long for type character varying(30)".)

Enry v2 can detect languages whose names are longer than the 30 characters the column currently allows. The insert in the log above fails on "Glyph Bitmap Distribution Format", which is 32 characters long.

A quick look at Enry's source code shows that the longest language name it can currently report is 34 characters (see the code below).

We therefore need either a migration that widens this column or a step that truncates the language name Enry reports.


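// Prints the longest language name found in Enry's generated data tables.
package main

import (
	"fmt"

	"github.com/go-enry/go-enry/v2/data"
)

func main() {
	maxLangLen := 0
	maxLang := ""
	// len counts bytes, which equals the character count for ASCII names.
	// Languages that appear as keys of the language -> extensions table.
	for lang := range data.ExtensionsByLanguage {
		if len(lang) > maxLangLen {
			maxLang = lang
			maxLangLen = len(lang)
		}
	}
	// Languages that appear as values of the extension -> languages table.
	for _, vals := range data.LanguagesByExtension {
		for _, lang := range vals {
			if len(lang) > maxLangLen {
				maxLang = lang
				maxLangLen = len(lang)
			}
		}
	}
	// Languages known to the classifier.
	for lang := range data.LanguagesLogProbabilities {
		if len(lang) > maxLangLen {
			maxLang = lang
			maxLangLen = len(lang)
		}
	}
	fmt.Println("Max", maxLangLen, maxLang)
}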
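
For the second option (shortening the detected language before it is stored), a minimal sketch could look like the following. Note that `truncateLanguage` and its `limit` parameter are hypothetical names used only for illustration; they are not part of Gitea or Enry, and the value 30 is simply the column width reported in the error log.

```go
package main

import "fmt"

// truncateLanguage returns name cut to at most limit characters.
// Hypothetical helper for illustration; not part of Gitea or Enry.
// It works on runes rather than bytes so that a multi-byte character
// is never split in the middle.
func truncateLanguage(name string, limit int) string {
	if limit <= 0 {
		return ""
	}
	runes := []rune(name)
	if len(runes) <= limit {
		return name
	}
	return string(runes[:limit])
}

func main() {
	// 30 is the current column width according to the error log.
	fmt.Println(truncateLanguage("Glyph Bitmap Distribution Format", 30))
}
```

Truncation changes the name that is stored and could, in principle, make two long names collide, so widening the column with a migration is probably the safer of the two options.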
