- Jan 14, 2025
-
-
[ Upstream commit 231825b2e1ff6ba799c5eaf396d3ab2354e37c6b ]

This reverts commit 5c26d2f1.

It turns out that we can't do this, because while the old behavior of ignoring ignorable code points was most definitely wrong, we have case-folding filesystems with on-disk hash values with that wrong behavior.

So now you can't look up those names, because they hash to something different.

Of course, it's also entirely possible that in the meantime people have created *new* files with the new ("more correct") case folding logic, and reverting will just make other things break.

The correct solution is to not do case folding in filesystems, but sadly, people seem to never really understand that. People still see it as a feature, not a bug.

Reported-by: Qi Han <hanqi@vivo.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219586
Cc: Gabriel Krisman Bertazi <krisman@suse.de>
Requested-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
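The lookup failure described in the revert can be illustrated with a small sketch. Everything here is an assumption made for illustration: `IGNORABLE` is a tiny hypothetical subset of Default_Ignorable_Code_Point, `fold()` is a stand-in for the kernel's casefolding, and SHA-256 stands in for whatever hash the filesystem stores on disk.

```python
import hashlib
import unicodedata

# Hypothetical, incomplete subset of Default_Ignorable_Code_Point (illustration only).
IGNORABLE = {"\u00ad", "\u200b", "\u200c", "\u200d", "\ufeff"}


def fold(name: str, drop_ignorable: bool) -> str:
    """NFD + full casefold, optionally dropping ignorable code points first."""
    if drop_ignorable:
        name = "".join(ch for ch in name if ch not in IGNORABLE)
    return unicodedata.normalize("NFD", name).casefold()


def toy_hash(name: str, drop_ignorable: bool) -> str:
    # SHA-256 is only a stand-in for the real on-disk directory hash.
    return hashlib.sha256(fold(name, drop_ignorable).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    name = "Read\u200dMe"  # contains U+200D ZERO WIDTH JOINER
    stored = toy_hash(name, drop_ignorable=True)    # hash written with the old behavior
    lookup = toy_hash(name, drop_ignorable=False)   # hash computed with the changed behavior
    print("lookup finds the entry:", stored == lookup)
```

A name created under one behavior hashes to a different value under the other, so a hash-indexed directory lookup misses it. The same mismatch happens in the opposite direction for names created after the change, which is the second problem the message points out.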
-
[ Upstream commit 156bb2c569cd869583c593d27a5bd69e7b2a4264 ]

utf8_load() requests the symbol "utf8_data_table" and then checks if the requested UTF-8 version is supported. If it's unsupported, it tries to put the data table using symbol_put(). If an unsupported version is requested, symbol_put() fails like this:

 kernel BUG at kernel/module/main.c:786!
 RIP: 0010:__symbol_put+0x93/0xb0
 Call Trace:
  <TASK>
  ? __die_body.cold+0x19/0x27
  ? die+0x2e/0x50
  ? do_trap+0xca/0x110
  ? do_error_trap+0x65/0x80
  ? __symbol_put+0x93/0xb0
  ? exc_invalid_op+0x51/0x70
  ? __symbol_put+0x93/0xb0
  ? asm_exc_invalid_op+0x1a/0x20
  ? __pfx_cmp_name+0x10/0x10
  ? __symbol_put+0x93/0xb0
  ? __symbol_put+0x62/0xb0
  utf8_load+0xf8/0x150

That happens because symbol_put() expects the unique string that identifies the symbol, instead of a pointer to the loaded symbol. Fix that by using such a string.

Fixes: 2b3d0478 ("unicode: Add utf8-data module")
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20240902225511.757831-2-andrealmeid@igalia.com
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
-
commit 5c26d2f1 upstream.

We don't need to handle ignorable code points separately. Instead, just let them decompose/casefold to themselves.

Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Feb 14, 2022
-
-
Masahiro Yamada authored
cmd_copy and cmd_shipped have similar functionality. The difference is that cmd_copy uses 'cp' while cmd_shipped uses 'cat'.

Unify them into cmd_copy because this macro name is more intuitive.

Going forward, cmd_copy will use 'cat' to avoid the permission issue. I also thought of 'cp --no-preserve=mode' but this option is not mentioned in the POSIX spec [1], so I am keeping the 'cat' command.

[1]: https://pubs.opengroup.org/onlinepubs/009695299/utilities/cp.html

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
-
- Jan 21, 2022
-
-
Christoph Hellwig authored
Turn the CONFIG_UNICODE symbol into a tristate that generates some always built-in code and remove the confusing CONFIG_UNICODE_UTF8_DATA symbol.

Note that a lot of the IS_ENABLED() checks could be turned from cpp statements into normal ifs, but this change is intended to be fairly mechanical, so that should be cleaned up later.

Fixes: 2b3d0478 ("unicode: Add utf8-data module")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
-
- Jan 17, 2022
-
-
Linus Torvalds authored
Commit 2b3d0478 ("unicode: Add utf8-data module") changed the generated utf8data file from 'utf8data.h' to 'utf8data.c', but didn't change the comments or the .gitignore to match.

The comments should be updated too, but at least they don't cause any visible breakage. But the gitignore file needs changing to avoid git complaining about untracked files.

Fixes: 2b3d0478 ("unicode: Add utf8-data module")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- Oct 12, 2021
-
-
Christoph Hellwig authored
The exported symbols in utf8-norm.c are not needed for normal file system consumers, so move them to conditional _GPL exports just for the selftest.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
-
Christoph Hellwig authored
utf8data.h contains a large database table which is an auto-generated decodification trie for the unicode normalization functions. Allow building it into a separate module.

Based on a patch from Shreeya Patel <shreeya.patel@collabora.com>.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
-
- Oct 11, 2021
-
-
Christoph Hellwig authored
Instead of repeatedly looking up the version, add pointers to the NFD and NFD+CF tables to struct unicode_map, and pass a unicode_map plus index to the functions using the normalization tables.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
-
Christoph Hellwig authored
Only used by the tests, so no need to keep it in the core.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
-
Christoph Hellwig authored
Just use the utf8nlen implementation with a (size_t)-1 len argument, similar to utf8_lookup. Also move the function to utf8-selftest.c, as it isn't used anywhere else.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
-
Christoph Hellwig authored
Not actually used anywhere.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
-
Christoph Hellwig authored
Don't bother with pointless string parsing when the caller can just pass the version in the format that the core expects. Also remove the fallback to the latest version, which none of the callers actually uses.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
-
Christoph Hellwig authored
It is hardcoded and only used for an f2fs sysfs file where it can be hardcoded just as easily.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
-
- May 01, 2021
-
-
Masahiro Yamada authored
The pattern prefixed with '/' matches files in the same directory, but not ones in sub-directories.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Andra Paraschiv <andraprs@amazon.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Gabriel Krisman Bertazi <krisman@collabora.com>
-
- Sep 10, 2020
-
-
Daniel Rosenberg authored
This adds a case-insensitive hash function to allow taking the hash without needing to allocate a casefolded copy of the string.

The existing d_hash implementations for casefolding allocate memory within rcu-walk; by avoiding that we can be more efficient and avoid worrying about a failed allocation.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
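The idea of hashing while folding, rather than folding into a temporary copy and hashing that, can be sketched as follows. This is only an illustration under stated assumptions: FNV-1a is a stand-in hash, not the one the kernel uses; the input is assumed to be already normalized, so only casefolding is streamed; and the function names are hypothetical. Python still creates small temporaries internally, so the sketch shows the shape of the approach rather than its memory behavior.

```python
FNV_OFFSET = 0xCBF29CE484222325  # 64-bit FNV-1a offset basis
FNV_PRIME = 0x100000001B3        # 64-bit FNV-1a prime
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a(data: bytes) -> int:
    """Reference FNV-1a over a complete byte string."""
    h = FNV_OFFSET
    for b in data:
        h = ((h ^ b) * FNV_PRIME) & MASK64
    return h


def ci_hash_streaming(name: str) -> int:
    """Fold one code point at a time and feed it straight into the hash."""
    h = FNV_OFFSET
    for ch in name:
        for b in ch.casefold().encode("utf-8"):
            h = ((h ^ b) * FNV_PRIME) & MASK64
    return h


def ci_hash_with_copy(name: str) -> int:
    """Build the whole casefolded copy first, then hash it."""
    return fnv1a(name.casefold().encode("utf-8"))
```

Both functions return the same value for the same name; the streaming variant never needs a buffer as large as the name, which is what removes the allocation (and its failure path) from the lookup.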
-
- Mar 25, 2020
-
-
Masahiro Yamada authored
Add SPDX License Identifier to all .gitignore files.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Feb 03, 2020
-
-
Masahiro Yamada authored
In the old days, the "host-progs" syntax was used for specifying host programs. It was renamed to the current "hostprogs-y" in 2004.

It is typically useful in scripts/Makefile because it allows Kbuild to selectively compile host programs based on the kernel configuration.

This commit renames them as follows:

  always       ->  always-y
  hostprogs-y  ->  hostprogs

So, scripts/Makefile will look like this:

  always-$(CONFIG_BUILD_BIN2C) += ...
  always-$(CONFIG_KALLSYMS)    += ...
      ...
  hostprogs := $(always-y) $(always-m)

I think this makes more sense because a host program is always a host program, irrespective of the kernel configuration. We want to specify which ones to compile by CONFIG options, so always-y will be handier.

The "always", "hostprogs-y", "hostprogs-m" will be kept for backward compatibility for a while.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
-
- Sep 17, 2019
-
-
Colin Ian King authored
Don't populate the array 'token' on the stack but instead make it static const. Makes the object code smaller by 234 bytes.

Before:
   text    data     bss     dec     hex filename
   5371     272       0    5643    160b fs/unicode/utf8-core.o

After:
   text    data     bss     dec     hex filename
   5041     368       0    5409    1521 fs/unicode/utf8-core.o

(gcc version 9.2.1, amd64)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
-
Krzysztof Wilczynski authored
Move the static keyword to the front of declarations of nfdi_test_data and nfdicf_test_data, and resolve the following compiler warnings that can be seen when building with warnings enabled (W=1):

fs/unicode/utf8-selftest.c:38:1: warning:
  ‘static’ is not at beginning of declaration [-Wold-style-declaration]

fs/unicode/utf8-selftest.c:92:1: warning:
  ‘static’ is not at beginning of declaration [-Wold-style-declaration]

Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
-
- Jun 20, 2019
-
-
Gabriel Krisman Bertazi authored
Temporarily cache a casefolded version of the file name under lookup in ext4_filename, to avoid repeatedly casefolding it.

I got up to 30% speedup on lookups of large directories (>100k entries), depending on the length of the string under lookup.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
- Jun 05, 2019
-
-
Thomas Gleixner authored
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify it under the terms of the gnu general public license 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details

  this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation this program is distributed in the hope that it [would] be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 9 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.804956444@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Gleixner authored
Based on 1 normalized pattern(s):

  this software is licensed under the terms of the gnu general public license version 2 as published by the free software foundation and may be copied distributed and modified under those terms this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 285 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141900.642774971@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- May 21, 2019
-
-
Thomas Gleixner authored
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- May 12, 2019
-
-
Theodore Ts'o authored
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
-
Theodore Ts'o authored
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
-
- Apr 28, 2019
-
-
Masahiro Yamada authored
scripts/mkutf8data is used only when regenerating utf8data.h, which never happens in the normal kernel build. However, it is built regardless whenever CONFIG_UNICODE is enabled.

Moreover, there is no good reason for it to reside in the scripts/ directory since it is only used in fs/unicode/.

Hence, move it from scripts/ to fs/unicode/.

In some cases, we bypass build artifacts in the normal build. The conventional way to do so is to surround the code with ifdef REGENERATE_*.

For example,

 - 7373f4f8 ("kbuild: add implicit rules for parser generation")
 - 6aaf49b4 ("crypto: arm,arm64 - Fix random regeneration of S_shipped")

I rewrote the rule in a more kbuild'ish style.

In the normal build, utf8data.h is just shipped from the check-in file.

$ make
  [ snip ]
  SHIPPED fs/unicode/utf8data.h
  CC      fs/unicode/utf8-norm.o
  CC      fs/unicode/utf8-core.o
  CC      fs/unicode/utf8-selftest.o
  AR      fs/unicode/built-in.a

If you want to generate utf8data.h based on UCD, put *.txt files into fs/unicode/, then pass REGENERATE_UTF8DATA=1 from the command line. The mkutf8data tool will be automatically compiled to generate the utf8data.h from the *.txt files.

$ make REGENERATE_UTF8DATA=1
  [ snip ]
  HOSTCC  fs/unicode/mkutf8data
  GEN     fs/unicode/utf8data.h
  CC      fs/unicode/utf8-norm.o
  CC      fs/unicode/utf8-core.o
  CC      fs/unicode/utf8-selftest.o
  AR      fs/unicode/built-in.a

I renamed the check-in utf8data.h to utf8data.h_shipped so that this will work for the out-of-tree build.

You can update it based on the latest UCD like this:

$ make REGENERATE_UTF8DATA=1 fs/unicode/
$ cp fs/unicode/utf8data.h fs/unicode/utf8data.h_shipped

Also, I added entries to .gitignore and dontdiff.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
- Apr 25, 2019
-
-
Gabriel Krisman Bertazi authored
Regenerate utf8data.h based on the latest UCD files and run tests against the latest version.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
Gabriel Krisman Bertazi authored
This implements an in-kernel sanity test module for the utf8 normalization core. At probe time, it will run basic sequences through the utf8n core, to identify problems with equivalent sequences and normalization/casefold code.

This is supposed to be useful for regression testing when adding support for a new version of utf8 to linux.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
Gabriel Krisman Bertazi authored
This patch integrates the utf8n patches with some higher level API to perform UTF-8 string comparison, normalization and casefolding operations. Implemented is a variation of NFD, and casefold is performed by doing full casefold on top of NFD. These algorithms are based on the core implemented by Olaf Weber from SGI.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
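As a rough user-space analogue of "full casefold on top of NFD", the comparison can be sketched with Python's standard library. This is not the kernel API: the function names are hypothetical, Python's `unicodedata` tables may be a different Unicode version than the kernel's, and the kernel implements "a variation of NFD", so results can differ at the edges. The sketch follows the general shape of Unicode canonical caseless matching, NFD(casefold(NFD(x))).

```python
import unicodedata


def nfd_casefold(name: str) -> str:
    """Decompose canonically, apply full casefolding, then decompose again."""
    decomposed = unicodedata.normalize("NFD", name)
    return unicodedata.normalize("NFD", decomposed.casefold())


def names_equal_exact(a: str, b: str) -> bool:
    """Normalization-insensitive but case-sensitive comparison."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def names_equal_casefold(a: str, b: str) -> bool:
    """Normalization-insensitive and case-insensitive comparison."""
    return nfd_casefold(a) == nfd_casefold(b)
```

For example, a precomposed "é" (U+00E9) and "e" followed by U+0301 compare equal under both functions, while "Makefile" and "makefile" compare equal only under `names_equal_casefold`.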
-
Olaf Weber authored
Remove the Hangul decompositions from the utf8data trie, and do algorithmic decomposition to calculate them on the fly. To store the decomposition the caller of utf8lookup()/utf8nlookup() must provide a 12-byte buffer, which is used to synthesize a leaf with the decomposition. This significantly reduces the size of the utf8data[] array.

Changes made by Gabriel:
  Rebase to mainline
  Fix checkpatch errors
  Extract robustness fixes and merge back to original mkutf8data.c patch
  Regenerate utf8data.h

Signed-off-by: Olaf Weber <olaf@sgi.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
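The algorithmic decomposition that replaces the table entries is the standard Hangul syllable arithmetic from the Unicode Standard. A minimal sketch, with constant and function names chosen here for illustration (they are not taken from the kernel source):

```python
import unicodedata

# Constants from the Unicode Standard's Hangul syllable decomposition.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT  # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT  # 11172 precomposed syllables in total


def decompose_hangul(ch: str) -> str:
    """Return the canonical decomposition of one precomposed Hangul syllable."""
    s_index = ord(ch) - S_BASE
    if not 0 <= s_index < S_COUNT:
        return ch  # not a precomposed Hangul syllable, nothing to compute
    lead = L_BASE + s_index // N_COUNT
    vowel = V_BASE + (s_index % N_COUNT) // T_COUNT
    trail = T_BASE + s_index % T_COUNT
    out = chr(lead) + chr(vowel)
    if trail != T_BASE:  # T_BASE itself means "no trailing consonant"
        out += chr(trail)
    return out


if __name__ == "__main__":
    # Cross-check the arithmetic against the standard library's NFD.
    for cp in range(S_BASE, S_BASE + S_COUNT):
        assert decompose_hangul(chr(cp)) == unicodedata.normalize("NFD", chr(cp))
```

Each syllable decomposes into at most three jamo, and each jamo takes three bytes in UTF-8, so the result is at most nine bytes, which fits in the 12-byte buffer the message mentions. Computing 11172 decompositions from a handful of constants is what lets the trie drop those entries.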
-
Olaf Weber authored
Supporting functions for UTF-8 normalization are in utf8norm.c with the header utf8norm.h. Two normalization forms are supported: nfdi and nfdicf.

  nfdi:
   - Apply unicode normalization form NFD.
   - Remove any Default_Ignorable_Code_Point.

  nfdicf:
   - Apply unicode normalization form NFD.
   - Remove any Default_Ignorable_Code_Point.
   - Apply a full casefold (C + F).

For the purposes of the code, a string is valid UTF-8 if:

 - The values encoded are 0x1..0x10FFFF.
 - The surrogate codepoints 0xD800..0xDFFF are not encoded.
 - The shortest possible encoding is used for all values.

The supporting functions work on null-terminated strings (utf8 prefix) and on length-limited strings (utf8n prefix).

From the original SGI patch and for conformity with coding standards, the utf8data_t typedef was dropped, since it was just masking the struct keyword. On other occasions, namely utf8leaf_t and utf8trie_t, I decided to keep it, since they are simple pointers to memory buffers, and using uchars here wouldn't provide any more meaningful information.

From the original submission, we also converted from the compatibility form to canonical.

Changes made by Gabriel:
  Rebase to Mainline
  Fix up checkpatch.pl warnings
  Drop typedefs
  move out of libxfs
  Convert from NFKD to NFD

Signed-off-by: Olaf Weber <olaf@sgi.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
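The three validity rules can be checked in user space with a short sketch. It relies on Python's strict UTF-8 decoder, which already rejects overlong encodings, encoded surrogates, and values above 0x10FFFF; the explicit NUL check covers the lower bound of 0x1. The function name is hypothetical and the kernel's own validation is implemented differently.

```python
def is_valid_utf8_name(data: bytes) -> bool:
    """Check the commit's three rules for a byte string."""
    try:
        # Strict decoding fails on overlong forms, surrogates and out-of-range values.
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # Valid values start at 0x1, so an encoded NUL is rejected as well.
    return "\x00" not in text
```

For instance, `b"\xc0\xaf"` (an overlong encoding of "/") and `b"\xed\xa0\x80"` (the surrogate U+D800) are both rejected, while `b"\xf4\x8f\xbf\xbf"` (U+10FFFF) is accepted.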
-
Gabriel Krisman Bertazi authored
The decomposition and casefolding of UTF-8 characters are described in a prefix tree in utf8data.h, which is generated from the Unicode Character Database (UCD), published by the Unicode Consortium, and should not be edited by hand. The structures in utf8data.h are meant to be used for lookup operations by the unicode subsystem, when decoding a utf-8 string.

mkutf8data.c is the source for a program that generates utf8data.h. It was written by Olaf Weber from SGI and originally proposed to be merged into Linux in 2014. The original proposal performed the compatibility decomposition, NFKD, but the current version was modified by me to do canonical decomposition, NFD, as suggested by the community.

The changes from the original submission are:

  * Rebase to mainline.
  * Fix out-of-tree-build.
  * Update makefile to build 11.0.0 ucd files.
  * drop references to xfs.
  * Convert NFKD to NFD.
  * Merge back robustness fixes from original patch. Requested by Dave Chinner.

The original submission is archived at:

<https://linux-xfs.oss.sgi.narkive.com/Xx10wjVY/rfc-unicode-utf-8-support-for-xfs>

The utf8data.h file can be regenerated using the instructions in fs/unicode/README.utf8data.

- Notes on the update from 8.0.0 to 11.0:

The structure of the ucd files and special cases have not experienced any changes between versions 8.0.0 and 11.0.0. 8.0.0 saw the addition of Cherokee LC characters, which is an interesting case for case-folding. The update is accompanied by new tests on the test_ucd module to catch specific cases. No changes to mkutf8data script were required for the updates.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-