16 Commits

Author SHA1 Message Date
Andrew Halaney bd062efa75 printk: Wait for all reserved records with pr_flush()
JIRA: https://issues.redhat.com/browse/RHEL-3987
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/

commit 46c6d072f618fc70fb1e629ab3acfe7e06dcb8b4
Author: John Ogness <john.ogness@linutronix.de>
Date:   Mon Nov 6 14:59:55 2023 +0000

    printk: Wait for all reserved records with pr_flush()

    Currently pr_flush() will only wait for records that were
    available to readers at the time of the call (using
    prb_next_seq()). But there may be more records (non-finalized)
    that have following finalized records. pr_flush() should wait
    for these to print as well. Particularly because any trailing
    finalized records may be the messages that the calling context
    wants to ensure are printed.

    Add a new ringbuffer function prb_next_reserve_seq() to return
    the sequence number following the most recently reserved record.
    This guarantees that pr_flush() will wait until all current
    printk() messages (completed or in progress) have been printed.

    Fixes: 3b604ca81202 ("printk: add pr_flush()")
    Signed-off-by: John Ogness <john.ogness@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
2024-05-09 11:26:23 -04:00
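
As a rough illustration of "printk: Wait for all reserved records with pr_flush()" above, the difference between the two sequence numbers can be modeled with a toy array of record states. The enum, the toy_* names, and the zero-based numbering are hypothetical; the real ringbuffer tracks this in lockless descriptor state, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record states (not the kernel's descriptor states). */
enum rec_state { REC_RESERVED, REC_FINALIZED };

/*
 * Models what prb_next_seq() reports: the sequence number of the first
 * record that is not yet available to readers. Records are numbered from 0.
 */
static uint64_t toy_next_seq(const enum rec_state *recs, size_t n)
{
	size_t i;

	assert(recs != NULL);
	for (i = 0; i < n; i++) {
		if (recs[i] != REC_FINALIZED)
			break;
	}
	return i;
}

/*
 * Models what prb_next_reserve_seq() reports: the sequence number
 * following the most recently reserved record, finalized or not.
 */
static uint64_t toy_next_reserve_seq(const enum rec_state *recs, size_t n)
{
	assert(recs != NULL);
	return n;
}
```

With the states { finalized, finalized, reserved, finalized }, the first function stops at the non-finalized record while the second one also covers the trailing finalized record, which is the record a pr_flush() caller may care about.
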
Andrew Halaney 3c0b2aaf9f printk: ringbuffer: Clarify special lpos values
JIRA: https://issues.redhat.com/browse/RHEL-3987
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/

commit 70596fc48fbc159ec34ce9339df56ad352e09273
Author: John Ogness <john.ogness@linutronix.de>
Date:   Mon Oct 23 11:11:05 2023 +0000

    printk: ringbuffer: Clarify special lpos values

    For empty line records, no data blocks are created. Instead,
    these valid records are identified by special logical position
    values (in fields of @prb_desc.text_blk_lpos).

    Currently the macro NO_LPOS is used for empty line records.
    This name is confusing because it does not imply _why_ there is
    no data block.

    Rename NO_LPOS to EMPTY_LINE_LPOS so that it is clear why there
    is no data block.

    Also add comments explaining the use of EMPTY_LINE_LPOS as well
    as clarification to the values used to represent data-less
    blocks.

    Signed-off-by: John Ogness <john.ogness@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
2024-05-09 11:26:23 -04:00
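
The special lpos values described in "printk: ringbuffer: Clarify special lpos values" above could be sketched as follows. The numeric values, the struct layout, and the helper names are assumptions made for illustration; the sketch also assumes that real data block positions are always even (aligned), so that odd values are free to act as markers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative marker values; assumed to be odd so they never match a real position. */
#define FAILED_LPOS      0x1UL  /* reservation failed, no data block exists */
#define EMPTY_LINE_LPOS  0x3UL  /* valid record whose text is an empty line */

/* Hypothetical stand-in for the text_blk_lpos fields of a descriptor. */
struct toy_blk_lpos {
	unsigned long begin;
	unsigned long next;
};

/* A position with the lowest bit set carries no data. */
static bool lpos_dataless(unsigned long lpos)
{
	return (lpos & 1UL) != 0;
}

/* A block is data-less only when both of its positions are markers. */
static bool blk_dataless(const struct toy_blk_lpos *blk)
{
	assert(blk != NULL);
	return lpos_dataless(blk->begin) && lpos_dataless(blk->next);
}

/* Distinguishes a valid empty line record from a failed reservation. */
static bool blk_is_empty_line(const struct toy_blk_lpos *blk)
{
	assert(blk != NULL);
	return blk->begin == EMPTY_LINE_LPOS && blk->next == EMPTY_LINE_LPOS;
}
```

Both marker kinds are data-less, but only the empty line marker denotes a record that readers should treat as valid.
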
Andrew Halaney 3986afd83f printk: ringbuffer: Do not skip non-finalized records with prb_next_seq()
JIRA: https://issues.redhat.com/browse/RHEL-3987
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/

commit bc9d899cfe17946362bdf24ae8e21442615fb976
Author: John Ogness <john.ogness@linutronix.de>
Date:   Thu Oct 19 10:32:05 2023 +0000

    printk: ringbuffer: Do not skip non-finalized records with prb_next_seq()

    Commit f244b4dc53e5 ("printk: ringbuffer: Improve
    prb_next_seq() performance") introduced an optimization for
    prb_next_seq() by using best-effort tracking of recently finalized
    records. However, the order of finalization does not
    necessarily match the order of the records. The optimization
    changed prb_next_seq() to return inconsistent results, possibly
    yielding sequence numbers that are not available to readers
    because they are preceded by non-finalized records or they are
    not yet visible to the reader CPU.

    Rather than simply best-effort tracking recently finalized
    records, force the committing writer to read records and
    increment the last "contiguous block" of finalized records. In
    order to do this, the sequence number instead of ID must be
    stored because IDs cannot be directly compared.

    A new memory barrier pair is introduced to guarantee that a
    reader can always read the records up until the sequence number
    returned by prb_next_seq() (unless the records have since
    been overwritten in the ringbuffer).

    This restores the original functionality of prb_next_seq()
    while also keeping the optimization.

    For 32bit systems, only the lower 32 bits of the sequence
    number are stored. When reading the value, it is expanded to
    the full 64bit sequence number using the 32bit seq macros,
    which fold in the value returned by prb_first_seq().

    Fixes: f244b4dc53e5 ("printk: ringbuffer: Improve prb_next_seq() performance")
    Signed-off-by: John Ogness <john.ogness@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
2024-05-09 11:26:23 -04:00
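
The "contiguous block" idea from "printk: ringbuffer: Do not skip non-finalized records with prb_next_seq()" above can be sketched with a plain array. This is a single-threaded toy under stated assumptions: the function name and the bool array are hypothetical, and the atomics and memory barriers that the real change depends on are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * finalized[seq] tells whether record seq has been finalized.
 * next is the sequence number following the current contiguous block
 * of finalized records. The return value is the updated number after
 * extending that block as far as possible.
 */
static uint64_t toy_advance_contiguous(const bool *finalized, size_t n,
				       uint64_t next)
{
	assert(finalized != NULL);
	while (next < n && finalized[next])
		next++;
	return next;
}
```

If records 0, 2 and 3 are finalized but record 1 is not, the tracked value stops at 1 even though later records were finalized first. Once record 1 is finalized, a single call moves the value to 4, so readers are never pointed past a non-finalized record.
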
Andrew Halaney 5814676b11 printk: Use prb_first_seq() as base for 32bit seq macros
JIRA: https://issues.redhat.com/browse/RHEL-3987
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/

commit fcfa743b20e41c8a1cef0612900f0f1677f8a87b
Author: John Ogness <john.ogness@linutronix.de>
Date:   Wed Nov 22 16:13:37 2023 +0000

    printk: Use prb_first_seq() as base for 32bit seq macros

    Note: This change only applies to 32bit architectures. On 64bit
          architectures the macros are NOPs.

    Currently prb_next_seq() is used as the base for the 32bit seq
    macros __u64seq_to_ulseq() and __ulseq_to_u64seq(). However, in
    a follow-up commit, prb_next_seq() will need to make use of the
    32bit seq macros.

    Use prb_first_seq() as the base for the 32bit seq macros instead
    because it is guaranteed to return 64bit sequence numbers without
    relying on any 32bit seq macros.

    Signed-off-by: John Ogness <john.ogness@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
2024-05-09 11:26:23 -04:00
Andrew Halaney 7e7b42b26f printk: Adjust mapping for 32bit seq macros
JIRA: https://issues.redhat.com/browse/RHEL-3987
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/

commit 3fcd8258460134ae4a79cbb24a39efeeecef6206
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Thu Dec 7 14:15:15 2023 +0000

    printk: Adjust mapping for 32bit seq macros

    Note: This change only applies to 32bit architectures. On 64bit
          architectures the macros are NOPs.

    __ulseq_to_u64seq() computes the upper 32 bits of the passed
    argument value (@ulseq). The upper bits are derived from a base
    value (@rb_next_seq) in a way that assumes @ulseq represents a
    64bit number that is less than or equal to @rb_next_seq.

    Until now this mapping has been correct for all call sites. However,
    in a follow-up commit, values of @ulseq will be passed in that are
    higher than the base value. This requires a change to how the 32bit
    value is mapped to a 64bit sequence number.

    Rather than mapping @ulseq such that the base value is the end of a
    32bit block, map @ulseq such that the base value is in the middle of
    a 32bit block. This allows supporting 31 bits before and after the
    base value, which is deemed acceptable for the console sequence
    number during runtime.

    Here is an example to illustrate the previous and new mappings.

    For a base value (@rb_next_seq) of 2 2000 0000...

    Before this change the range of possible return values was:

    1 2000 0001 to 2 2000 0000

    __ulseq_to_u64seq(1fff ffff) => 2 1fff ffff
    __ulseq_to_u64seq(2000 0000) => 2 2000 0000
    __ulseq_to_u64seq(2000 0001) => 1 2000 0001
    __ulseq_to_u64seq(9fff ffff) => 1 9fff ffff
    __ulseq_to_u64seq(a000 0000) => 1 a000 0000
    __ulseq_to_u64seq(a000 0001) => 1 a000 0001

    After this change the range of possible return values is:

    1 a000 0001 to 2 a000 0000

    __ulseq_to_u64seq(1fff ffff) => 2 1fff ffff
    __ulseq_to_u64seq(2000 0000) => 2 2000 0000
    __ulseq_to_u64seq(2000 0001) => 2 2000 0001
    __ulseq_to_u64seq(9fff ffff) => 2 9fff ffff
    __ulseq_to_u64seq(a000 0000) => 2 a000 0000
    __ulseq_to_u64seq(a000 0001) => 1 a000 0001

    [ john.ogness: Rewrite commit message. ]

    Reported-by: Francesco Dolcini <francesco@dolcini.it>
    Reported-by: kernel test robot <oliver.sang@intel.com>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: John Ogness <john.ogness@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
2024-05-09 11:26:23 -04:00
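
The two mappings from "printk: Adjust mapping for 32bit seq macros" above can be reproduced with ordinary fixed-width arithmetic. The function names below are hypothetical, and the base is passed in as a parameter rather than read from the ringbuffer; the sketch only aims to match the example values in the commit message. The signed variant relies on the usual two's complement conversion from uint32_t to int32_t.

```c
#include <stdint.h>

/* Truncate a 64bit sequence number to its lower 32 bits. */
static uint32_t toy_u64seq_to_ulseq(uint64_t seq)
{
	return (uint32_t)seq;
}

/*
 * Previous mapping: the 32bit distance is treated as unsigned, so the
 * result is always less than or equal to base (base ends the block).
 */
static uint64_t toy_ulseq_to_u64seq_old(uint64_t base, uint32_t ulseq)
{
	uint32_t diff = (uint32_t)base - ulseq;

	return base - (uint64_t)diff;
}

/*
 * New mapping: the 32bit distance is treated as signed, so the result
 * may lie up to 31 bits before or after base (base is mid-block).
 */
static uint64_t toy_ulseq_to_u64seq_new(uint64_t base, uint32_t ulseq)
{
	int32_t diff = (int32_t)((uint32_t)base - ulseq);

	return base - (uint64_t)(int64_t)diff;
}
```

For a base of 0x220000000, the old mapping turns 0x20000001 into 0x120000001, while the new mapping yields 0x220000001. Both mappings agree on 0xa0000001, which becomes 0x1a0000001.
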
Andrew Halaney a2c328b13f printk: nbcon: Relocate 32bit seq macros
JIRA: https://issues.redhat.com/browse/RHEL-3987
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/

commit e55641cf1214f2888a8676c3d0f74a17c080a8bf
Author: John Ogness <john.ogness@linutronix.de>
Date:   Wed Dec 6 12:01:56 2023 +0000

    printk: nbcon: Relocate 32bit seq macros

    The macros __seq_to_nbcon_seq() and __nbcon_seq_to_seq() are
    used to provide support for atomic handling of sequence numbers
    on 32bit systems. Until now this was only used by nbcon.c,
    which is why they were located in nbcon.c and include nbcon in
    the name.

    In a follow-up commit this functionality is also needed by
    printk_ringbuffer. Rather than duplicating the functionality,
    relocate the macros to printk_ringbuffer.h.

    Also, since the macros will no longer be nbcon-specific, rename
    them to __u64seq_to_ulseq() and __ulseq_to_u64seq().

    This does not result in any functional change.

    Signed-off-by: John Ogness <john.ogness@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
2024-05-09 11:26:23 -04:00
David Arcari 6355e01c81 printk: ringbuffer: Improve prb_next_seq() performance
Bugzilla: https://bugzilla.redhat.com/2117494

commit f244b4dc53e520d4570b2610436aba0593ce6f55
Author: Petr Mladek <pmladek@suse.com>
Date:   Fri Jan 21 18:36:28 2022 +0530

    printk: ringbuffer: Improve prb_next_seq() performance

    prb_next_seq() always iterates from the first known sequence number.
    In the worst case, it might loop 8k times for 256kB buffer,
    15k times for 512kB buffer, and 64k times for 2MB buffer.

    It was reported that polling and reading using syslog interface
    might occupy 50% of CPU.

    Speed up the search by storing @id of the last finalized descriptor.

    The loop is still needed because the @id is stored and read in a
    best-effort way. An atomic variable is used to keep the @id consistent.
    But the stores and reads are not serialized against each other.
    The descriptor could get reused in the meantime. The related sequence
    number will be used only when it is still valid.

    An invalid value should be read _only_ when there is a flood of messages
    and the ringbuffer is rapidly reused. Performance is the least of
    the problems in this case.

    Reported-by: Chunlei Wang <chunlei.wang@mediatek.com>
    Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
    Reviewed-by: John Ogness <john.ogness@linutronix.de>
    Signed-off-by: Petr Mladek <pmladek@suse.com>
    Link: https://lore.kernel.org/r/1642770388-17327-1-git-send-email-quic_mojha@quicinc.com
    Link: https://lore.kernel.org/lkml/YXlddJxLh77DKfIO@alley/T/#m43062e8b2a17f8dbc8c6ccdb8851fb0dbaabbb14

Signed-off-by: David Arcari <darcari@redhat.com>
2022-09-15 08:47:32 -04:00
Lukas Bulwahn 9bc284ca0b printk: rectify kernel-doc for prb_rec_init_wr()
The command 'find ./kernel/printk/ | xargs ./scripts/kernel-doc -none'
reported a mismatch with the kernel-doc of prb_rec_init_wr().

Rectify the kernel-doc, such that no issues remain for ./kernel/printk/.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210125081748.19903-1-lukas.bulwahn@gmail.com
2021-01-26 11:17:51 +01:00
John Ogness 59f8bcca1e printk: avoid and/or handle record truncation
If a reader provides a buffer that is smaller than the message text,
the @text_len field of @info will have a value larger than the buffer
size. If readers blindly read @text_len bytes of data without
checking the size, they will read beyond their buffer.

Add this check to record_print_text() to properly recognize when such
truncation has occurred.

Add a maximum size argument to the ringbuffer function to extend
records so that records cannot be created that are larger than the
buffer size of readers.

When extending records (LOG_CONT), do not extend records beyond
LOG_LINE_MAX since that is the maximum size available in the buffers
used by consoles and syslog.

Fixes: f5f022e53b ("printk: reimplement log_cont using record extension")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200930090134.8723-2-john.ogness@linutronix.de
2020-09-30 13:30:28 +02:00
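
The reader-side check described in "printk: avoid and/or handle record truncation" above amounts to never trusting the reported text length beyond the size of the reader's own buffer. A minimal sketch, with hypothetical helper names that are not taken from the patch:

```c
#include <stdbool.h>
#include <stddef.h>

/* True when the record text did not fit into the reader's buffer. */
static bool toy_text_truncated(size_t text_len, size_t buf_size)
{
	return text_len > buf_size;
}

/* Number of text bytes the reader may safely access in its buffer. */
static size_t toy_usable_text_len(size_t text_len, size_t buf_size)
{
	return toy_text_truncated(text_len, buf_size) ? buf_size : text_len;
}
```

A reader holding a 64 byte buffer for a record that reports 100 bytes of text would process only 64 bytes and treat the record as truncated.
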
John Ogness f35efc78ad printk: remove dict ring
Since there is no code that will ever store anything into the dict
ring, remove it. If any future dictionary properties are to be
added, these should be added to the struct printk_info.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200918223421.21621-4-john.ogness@linutronix.de
2020-09-22 11:39:18 +02:00
John Ogness 74caba7f2a printk: move dictionary keys to dev_printk_info
Dictionaries are only used for SUBSYSTEM and DEVICE properties. The
current implementation stores the property names each time they are
used. This requires more space than otherwise necessary. Also,
because the dictionary entries are currently considered optional,
it cannot be relied upon that they are always available, even if the
writer wanted to store them. These issues will increase should new
dictionary properties be introduced.

Rather than storing the subsystem and device properties in the
dict ring, introduce a struct dev_printk_info with separate fields
to store only the property values. Embed this struct within the
struct printk_info to provide guaranteed availability.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/87mu1jl6ne.fsf@jogness.linutronix.de
2020-09-22 11:27:48 +02:00
John Ogness cfe2790b16 printk: move printk_info into separate array
The majority of the size of a descriptor is taken up by metadata,
which is often not of interest to the ringbuffer (for example,
when performing state checks). Since descriptors are often
temporarily stored on the stack, keeping their size minimal will
help reduce stack pressure.

Rather than embedding the printk_info into the descriptor, create
a separate printk_info array. The index of a descriptor in the
descriptor array corresponds to the printk_info with the same
index in the printk_info array. The rules for validity of a
printk_info match the existing rules for the data blocks: the
descriptor must be in a consistent state.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200918223421.21621-2-john.ogness@linutronix.de
2020-09-22 11:09:42 +02:00
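
The parallel-array layout from "printk: move printk_info into separate array" above can be sketched as two arrays that share one index. All type, field, and function names here are hypothetical, and the sketch assumes the array size is a power of two so that an ID can be wrapped with a mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small descriptor: only what the ringbuffer itself needs to check state. */
struct toy_desc {
	unsigned long state_var;
};

/* Larger metadata kept out of the descriptor. */
struct toy_info {
	uint64_t seq;
	uint16_t text_len;
};

struct toy_desc_ring {
	unsigned int count_bits;   /* both arrays hold 1 << count_bits entries */
	struct toy_desc *descs;
	struct toy_info *infos;
};

/* Wrap an ID to an array index (assumes a power-of-two array size). */
static size_t toy_index(const struct toy_desc_ring *ring, unsigned long id)
{
	assert(ring != NULL);
	return id & ((1UL << ring->count_bits) - 1);
}

static struct toy_desc *toy_to_desc(struct toy_desc_ring *ring, unsigned long id)
{
	return &ring->descs[toy_index(ring, id)];
}

/* The info for a descriptor lives at the same index in the other array. */
static struct toy_info *toy_to_info(struct toy_desc_ring *ring, unsigned long id)
{
	return &ring->infos[toy_index(ring, id)];
}
```

Code that only needs the state can copy a struct toy_desc to the stack without dragging the metadata along.
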
John Ogness 4cfc7258f8 printk: ringbuffer: add finalization/extension support
Add support for extending the newest data block. For this, introduce
a new finalization state (desc_finalized) denoting a committed
descriptor that cannot be extended.

Until a record is finalized, a writer can reopen that record to
append new data. Reopening a record means transitioning from the
desc_committed state back to the desc_reserved state.

A writer can explicitly finalize a record if there is no intention
of extending it. Also, records are automatically finalized when a
new record is reserved. This relieves writers of needing to
explicitly finalize while also making such records available to
readers sooner. (Readers can only traverse finalized records.)

Four new memory barrier pairs are introduced. Two of them are
insignificant additions (data_realloc:A/desc_read:D and
data_realloc:A/data_push_tail:B) because they are alternate path
memory barriers that exactly match the purpose, pairing, and
context of the two existing memory barrier pairs they provide an
alternate path for. The other two new memory barrier pairs are
significant additions:

desc_reopen_last:A / _prb_commit:B - When reopening a descriptor,
    ensure the state transitions back to desc_reserved before
    fully trusting the descriptor data.

_prb_commit:B / desc_reserve:D - When committing a descriptor,
    ensure the state transitions to desc_committed before checking
    the head ID to see if the descriptor needs to be finalized.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200914123354.832-6-john.ogness@linutronix.de
2020-09-15 16:35:27 +02:00
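
The state transitions described in "printk: ringbuffer: add finalization/extension support" above can be modeled as a small state machine. This is a single-threaded sketch with hypothetical function names; the real transitions are performed atomically and guarded by the memory barriers listed in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_desc_state { desc_reserved, desc_committed, desc_finalized };

/* reserved -> committed: the writer is done, but the record may be reopened. */
static bool toy_commit(enum toy_desc_state *s)
{
	assert(s != NULL);
	if (*s != desc_reserved)
		return false;
	*s = desc_committed;
	return true;
}

/* committed -> reserved: reopen the record to append more data. */
static bool toy_reopen(enum toy_desc_state *s)
{
	assert(s != NULL);
	if (*s != desc_committed)
		return false;
	*s = desc_reserved;
	return true;
}

/* committed -> finalized: the record can no longer be extended. */
static bool toy_make_final(enum toy_desc_state *s)
{
	assert(s != NULL);
	if (*s != desc_committed)
		return false;
	*s = desc_finalized;
	return true;
}

/* Readers can only traverse finalized records. */
static bool toy_readable(enum toy_desc_state s)
{
	return s == desc_finalized;
}
```

A record can go through commit and reopen any number of times, but once it is finalized a reopen attempt fails and the record becomes visible to readers.
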
John Ogness 10dcb06d40 printk: ringbuffer: change representation of states
Rather than deriving the state by evaluating bits within the flags
area of the state variable, assign the states explicit values and
set those values in the flags area. Introduce macros to make it
simple to read and write state values for the state variable.

Although the functionality is preserved, the binary representation
for the states is changed.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200914123354.832-5-john.ogness@linutronix.de
2020-09-15 15:52:49 +02:00
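
One way to picture the explicit state values from "printk: ringbuffer: change representation of states" above is a state variable whose top two bits hold the state and whose remaining bits hold the descriptor ID. The macro names, the two-bit width, and the state values below are assumptions chosen for illustration, not a copy of the kernel definitions.

```c
#include <limits.h>

/* Hypothetical layout: top 2 bits = state, remaining bits = ID. */
#define TOY_SV_BITS       (sizeof(unsigned long) * CHAR_BIT)
#define TOY_FLAGS_SHIFT   (TOY_SV_BITS - 2)
#define TOY_FLAGS_MASK    (3UL << TOY_FLAGS_SHIFT)
#define TOY_ID_MASK       (~TOY_FLAGS_MASK)

/* Read the state value out of a state variable. */
#define TOY_STATE(sv)     (3UL & ((sv) >> TOY_FLAGS_SHIFT))
/* Read the ID out of a state variable. */
#define TOY_ID(sv)        ((sv) & TOY_ID_MASK)
/* Build a state variable from an ID and a state value. */
#define TOY_SV(id, state) \
	(((unsigned long)(state) << TOY_FLAGS_SHIFT) | ((id) & TOY_ID_MASK))

/* Illustrative explicit state values. */
enum toy_state {
	toy_reserved  = 0x0,
	toy_committed = 0x1,
	toy_reusable  = 0x3,
};
```

With such macros, checking a state becomes a comparison such as TOY_STATE(sv) == toy_committed instead of testing individual flag bits.
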
John Ogness d397820f36 printk: ringbuffer: support dataless records
With commit 896fbe20b4 ("printk: use the lockless ringbuffer"),
printk() started silently dropping messages without text because such
records are not supported by the new printk ringbuffer.

Add support for such records.

Currently dataless records are denoted by INVALID_LPOS in order
to recognize failed prb_reserve() calls. Change the ringbuffer
to instead use two different identifiers (FAILED_LPOS and
NO_LPOS) to distinguish between failed prb_reserve() records and
successful dataless records, respectively.

Fixes: 896fbe20b4 ("printk: use the lockless ringbuffer")
Fixes: https://lkml.kernel.org/r/20200718121053.GA691245@elver.google.com
Reported-by: Marco Elver <elver@google.com>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200721132528.9661-1-john.ogness@linutronix.de
2020-09-08 09:32:59 +02:00
John Ogness b6cf8b3f33 printk: add lockless ringbuffer
Introduce a multi-reader multi-writer lockless ringbuffer for storing
the kernel log messages. Readers and writers may use their API from
any context (including scheduler and NMI). This ringbuffer will make
it possible to decouple printk() callers from any context, locking,
or console constraints. It also makes it possible for readers to have
full access to the ringbuffer contents at any time and context (for
example from any panic situation).

The printk_ringbuffer is made up of 3 internal ringbuffers:

desc_ring:
A ring of descriptors. A descriptor contains all record metadata
(sequence number, timestamp, loglevel, etc.) as well as internal state
information about the record and logical positions specifying where in
the other ringbuffers the text and dictionary strings are located.

text_data_ring:
A ring of data blocks. A data block consists of an unsigned long
integer (ID) that maps to a desc_ring index followed by the text
string of the record.

dict_data_ring:
A ring of data blocks. A data block consists of an unsigned long
integer (ID) that maps to a desc_ring index followed by the dictionary
string of the record.

The internal state information of a descriptor is the key element to
allow readers and writers to locklessly synchronize access to the data.

Co-developed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20200709132344.760-3-john.ogness@linutronix.de
2020-07-10 08:48:19 +02:00