The BODY_MAYBE state really describes the "we are in a declaration" state.
Rename it accordingly, and split the handling of this state out from that
of the other BODY* states. This change introduces a fair amount of
duplicated code that will be coalesced in a later patch.
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250621203512.223189-4-corbet@lwn.net
Pull the repeated "begin a section" logic into a single place and hide it
within the KernelEntry class.
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250621203512.223189-3-corbet@lwn.net
The regex in the BODY_WITH_BLANK_LINE case was looking for lines starting
with " * ", where exactly one space was allowed before the following text.
There are many kerneldoc comments where the authors have put multiple
spaces instead, leading to mis-formatting of the documentation.
Specifically, in this case, the description portion is associated with the
last of the parameters.
Allow multiple spaces in this context.
See, for example, synchronize_hardirq() and how its documentation is
formatted before and after the change.
Acked-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Tested-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250621203512.223189-2-corbet@lwn.net
Add some comments to process_name() to cover its broad phases of operation,
and slightly restructure the if/then/else structure to remove some early
returns.
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250606163438.229916-10-corbet@lwn.net
Move two complex regexes up with the other patterns, decluttering this
function and allowing the compilation to be done once rather than for every
kerneldoc comment.
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250606163438.229916-9-corbet@lwn.net
The code testing for a pointer declaration in process_name() has no actual
effect on subsequent actions; remove it.
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250606163438.229916-8-corbet@lwn.net
The entry.descr value used in process_name() is not actually a member of
the KernelEntry class; it is a bit of local state. So just manage it
locally.
A trim_whitespace() helper was added to clean up the code slightly.
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250606163438.229916-7-corbet@lwn.net
entry::is_kernel_comment never had anything to do with the entry itself; it
is a bit of local state in one branch of process_name(). It can, in fact,
be removed entirely; rework the code slightly so that it is no longer
needed.
No change in the rendered output.
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250606163438.229916-6-corbet@lwn.net
process_name() looks for the first line of a kerneldoc comment. It
contains two nearly identical regular expressions, the second of which only
catches six cases in the kernel, all of the form:
define SOME_MACRO_NAME - description
Simply put the "define" into the regex and discard it, eliminating the loop
and the code to remove it specially.
Note that this still treats these defines as if they were functions, but
that's a separate issue.
There is no change in the generated output.
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250606163438.229916-5-corbet@lwn.net
It is only used in one place, so just put the constant string
"Introduction" there so people don't have to go looking for it.
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250606163438.229916-4-corbet@lwn.net
Since all of the handlers already nicely have the same prototype, put them
into a table and call them from there and take out the extended
if-then-else series.
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250606163438.229916-3-corbet@lwn.net
The support for dropping "_noprof" missed dropping the suffix from
exported symbols. That meant that using the :export: feature would
look for kernel-doc for (eg) krealloc_noprof() and not find the
kernel-doc for krealloc().
Fixes: 51a7bf0238 (scripts/kernel-doc: drop "_noprof" on function prototypes)
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250606141543.1285671-1-willy@infradead.org
If a file sent to KernelFiles.msg() method doesn't exist, instead
of producing a KeyError, output an error message.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Closes: https://lore.kernel.org/linux-doc/cover.1747719873.git.mchehab+huawei@kernel.org/T/#ma43ae9d8d0995b535cf5099e5381dace0410de04
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <4efa177f2157a7ec009cc197dfc2d87e6f32b165.1747817887.git.mchehab+huawei@kernel.org>
The KernelDoc class is too complex. Start optimizing it by
placing the kernel-doc parser entry to a separate class.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <28b456f726a022011f0ce5810dbcc26827c1403a.1745564565.git.mchehab+huawei@kernel.org>
The script library here contain just classes. Remove execution
permission.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <be0b0a5bde82fa09027a5083f8202f150581eb4e.1745564565.git.mchehab+huawei@kernel.org>
Typedefs like
typedef struct phylink_pcs *(*pcs_xlate_t)(const u64 *args);
have a typedef_type that ends with a * and therefore has no word
boundary. Add an extra clause for the final group of the typedef_type so
we only require a word boundary if we match a word.
[mchehab: modify also kernel-doc.py, as we're deprecating the perl version]
Fixes: 7d2c6b1edf ("scripts: kernel-doc: fix parsing function-like typedefs")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/e0abb103c73a96d76602d909f60ab8fd6e2fd0bd.1744106242.git.mchehab+huawei@kernel.org
Change the logic which detects internal/external symbols in a way
that we can re-use it when calling via Sphinx extension.
While here, remove an unused self.config var and let it clearer
that self.config variables are read-only. This helps to allow
handling multiple times in parallel if ever needed.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/6a69ba8d2b7ee6a6427abb53e60d09bd4d3565ee.1744106242.git.mchehab+huawei@kernel.org
The filtering logic was seeking for the DOC name to check for
symbols, but such data is stored only inside a section. Add it
to the output_declaration, as it is quicker/easier to check
the declaration name than to check inside each section.
While here, make sure that the output for both ReST and man
after filtering will be similar to what kernel-doc Perl
version does.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/6d8b77af85295452c0191863ea1041f4195aeaaf.1744106242.git.mchehab+huawei@kernel.org
With the Pyhton version, the actual output happens after parsing,
from records stored at self.entries.
Ensure that line numbers will be properly stored there and
that they'll produce the desired results at the ReST output.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/5182a531d14b5fe9e1fc5da5f9dae05d66852a60.1744106242.git.mchehab+huawei@kernel.org
Instead of setting file lists at __init__ time, move it to
the actual parsing function. This allows adding more files
to be parsed in real time, by calling parse function multiple
times.
With the new way, the export_files logic was rewritten to
avoid parsing twice EXPORT_SYMBOL for partial matches.
Please notice that, with this logic, it can still read the
same file twice when export_file is used.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/ab10bc94050406ce6536d4944b5d718ecd70812f.1744106242.git.mchehab+huawei@kernel.org
The KernelFiles class is the main dispatcher which parses each
source file.
In preparation for letting kerneldoc Sphinx extension to import
Python libraries, move regex ancillary classes to a separate
file.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/80bc855e128a9ff0a11df5afe9ba71775dfc9a0f.1744106241.git.mchehab+huawei@kernel.org
The ABI documentation looks a little bit better if it starts
with the contents of the README is placed at the beginning.
Move it to the beginning of the ABI chapter. While here, improve
the README text and change the title that will be shown at the
html/pdf output to be coherent with both ABI file contents and
with the generated documentation output.
Suggested-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/20250211055809.1898623-1-mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
The undefined logic is complex and has lots of magic on it.
Implement it, using the same algorithm we have at get_abi.pl. Yet,
some tweaks to optimize performance and to make the code simpler
were added here:
- at the perl version, the tree graph had loops, so we had to
use BFS to traverse it. On this version, the graph is a tree,
so, it simplifies the what group for sysfs aliases;
- the logic which splits regular expressions into subgroups
was re-written to make it faster;
- it may optionally use multiple processes to search for symbol
matches;
- it has some additional debug levels.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/1529c255845d117696d5af57d8dc05554663afdf.1739182025.git.mchehab+huawei@kernel.org
Despite being introduced on Python 3.6, the original implementation
was too limited: it doesn't accept anything but the argument.
Even on python 3.10.12, support was still limited, as more complex
operations cause SyntaxError:
Exception occurred:
File ".../linux/Documentation/sphinx/kernel_abi.py", line 48, in <module>
from get_abi import AbiParser
File ".../linux/scripts/lib/abi/abi_parser.py", line 525
msg += f"{part}\n{"-" * len(part)}\n\n"
^
SyntaxError: f-string: expecting '}'
Replace f-strings by normal string concatenation when it doesn't
work on Python 3.6.
Reported-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/41d2f85df134a46db46fed73a0f9697a3d2ae9ba.1739182025.git.mchehab+huawei@kernel.org
Now that all ABI files are handled together, we can add a feature
at automarkup for it to generate cross-references for ABI symbols.
The cross-reference logic can produce references for all existing
files, except for README (as this is not parsed).
For symbols, they need to be an exact match of what it is
described at the docs, which is not always true due to wildcards.
If symbols at /sys /proc and /config are identical, a cross-reference
will be used.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/0b97a51b68b1c20127ad4a6a55658557fe0848d0.1739182025.git.mchehab+huawei@kernel.org
Right now, the logic parses ABI files on 4 steps, one for each
directory. While this is fine in principle, by doing that, not
all symbol cross-references will be created.
Change the logic to do the parsing only once in order to get
a global dictionary to be used when creating ABI cross-references.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/5205c53838b6ea25f4cdd4cc1e3d17c0141e75a6.1739182025.git.mchehab+huawei@kernel.org
The Documentation/ABI/README file is currently outside the
documentation tree. Yet, it may still provide some useful
information. Add it to the documentation parsing.
As a plus, this avoids a warning when detecting missing
cross-references.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/f1285dedfe4d0eb0f0af34f6a68bee6fde36dd7d.1739182025.git.mchehab+huawei@kernel.org
Instead of producing a big message with all ABI contents and then
parse as a whole, simplify the code by handling each ABI symbol
in separate. As an additional benefit, there's no need to place
file/line nubers inlined at the data and use a regex to convert
them.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/15be22955e3c6df49d7256c8fd24f62b397ad0ff.1739182025.git.mchehab+huawei@kernel.org
Instead of running get_abi.py script, import AbiParser class and
handle messages directly there using an interactor. This shold save some
memory, as there's no need to exec python inside the Sphinx python
extension.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/8dbc244dcda97112c1b694e2512a5d600e62873b.1739182025.git.mchehab+huawei@kernel.org
Instead of printing all results line per line, use an interactor
to return each variable as a separate message.
This won't change much when using it via command line, but it
will help Sphinx integration by providing an interactor that
could be used there to handle ABI symbol by symbol.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/e3c94b8cdfd5e955aa19a703921f364a89089634.1739182025.git.mchehab+huawei@kernel.org
The get_abi.pl script is requiring some care, but it seems that
the number of changes on it since when I originally wrote it
was not too high.
Maintaining perl scripts without using classes requires a higher
efforted than on python, due to global variables management.
Also, it sounds easier to find python developer those days than
perl ones.
As a plus, using a Python class to handle ABI allows a better
integration with Sphinx extensions, allowing, for instance,
to let automarkup to generate cross-references for ABI
symbols.
With that in mind, rewrite the core of get_abi.pl in Python,
using classes, to help producing documentation. This will
allow a better integration in the future with the Sphinx
ABI extension.
The algorithms used there are the same as the ones in Perl,
with a couple of cleanups to remove redundant variables and
to help with cross-reference generation. While doing that,
remove some code that were important in the past, where
ABI files weren't using ReST format.
Some minor improvements were added like using a fixed seed
when generating ABI keys for duplicated names, making its
results reproductible.
The end script is a little bit faster than the original one
(tested on a machine with ssd disks). That's probably because
we're now using only pre-compiled regular expressions, and it
is using string replacement methods instead of regex where
possible.
The new version is a little bit more conservative when
converting text to cross-references to avoid adding them into
literal blocks.
To ensure that the ReST output is parsing all variables
and files properly, the end result was compared using diff
with the one produced by the perl script and showed no regressions.
There are minor improvements at the results, as it now
properly groups What on some special cases. It also better
escape some XREF names.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/71a894211a8b69664711144d9c4f8a0e73d1ae3c.1739182025.git.mchehab+huawei@kernel.org