Commit Graph

2067 Commits

Author SHA1 Message Date
James Smart
982fc3965d scsi: lpfc: Don't release final kref on Fport node while ABTS outstanding
In a rarely executed path, FLOGI failure, there is a refcounting error.  If
FLOGI completed with an error, typically a timeout, the initial completion
handler would remove the job reference. However, the job completion isn't
the actual end of the job/exchange as the timeout usually initiates an
ABTS, and upon that ABTS completion, a final completion is sent. The driver
removes the reference again in the final completion. Thus the imbalance.

In the buggy cases, if there was a link bounce while the delayed response
is outstanding, the fport node may be referenced again but there was no
additional reference as it is already present. The delayed completion then
occurs and removes the last reference freeing the node and causing issues
in the link up processed that is using the node.

Fix this scenario by removing the snippet that removed the reference in the
initial FLOGI completion. The bad snippet was poorly trying to identify the
FLOGI as OK to do so by realizing the node was not registered with either
SCSI or NVMe transport.

Link: https://lore.kernel.org/r/20210910233159.115896-3-jsmart2021@gmail.com
Fixes: 618e2ee146 ("scsi: lpfc: Fix FLOGI failure due to accessing a freed node")
Cc: <stable@vger.kernel.org> # v5.13+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-14 23:33:20 -04:00
James Smart
99154581b0 scsi: lpfc: Fix list_add() corruption in lpfc_drain_txq()
When parsing the txq list in lpfc_drain_txq(), the driver attempts to pass
the requests to the adapter. If such an attempt fails, a local "fail_msg"
string is set and a log message output.  The job is then added to a
completions list for cancellation.

Processing of any further jobs from the txq list continues, but since
"fail_msg" remains set, jobs are added to the completions list regardless
of whether a wqe was passed to the adapter.  If successfully added to
txcmplq, jobs are added to both lists resulting in list corruption.

Fix by clearing the fail_msg string after adding a job to the completions
list. This stops the subsequent jobs from being added to the completions
list unless they had an appropriate failure.

Link: https://lore.kernel.org/r/20210910233159.115896-2-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-14 23:33:20 -04:00
Chi Minghao
5d1e15108b scsi: lpfc: Remove unneeded variable
Fix the following coccicheck REVIEW:

./drivers/scsi/lpfc/lpfc_scsi.c:1498:9-12 REVIEW Unneeded variable

Link: https://lore.kernel.org/r/20210831114058.17817-1-lv.ruyi@zte.com.cn
Reported-by: Zeal Robot <zealci@zte.com.cm>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Chi Minghao <chi.minghao@zte.com.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13 22:15:42 -04:00
James Smart
37e384095f scsi: lpfc: Fix compilation errors on kernels with no CONFIG_DEBUG_FS
The Kernel test robot flagged the following warning:

  ".../lpfc_init.c:7788:35: error: 'struct lpfc_sli4_hba' has no member
   named 'c_stat'"

Reviewing this issue highlighted that one of the recent patches caused the
driver to no longer compile cleanly if CONFIG_DEBUG_FS is not set.

Correct the different areas that are failing to compile.

Link: https://lore.kernel.org/r/20210908050927.37275-1-jsmart2021@gmail.com
Fixes: 02243836ad ("scsi: lpfc: Add support for the CM framework")
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Build-tested-by: Nathan Chancellor <nathan@kernel.org>
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13 22:15:41 -04:00
James Smart
59936430e6 scsi: lpfc: Fix CPU to/from endian warnings introduced by ELS processing
The kernel test robot reported the following sparse warning:
".../lpfc_els.c:3984:25: sparse: sparse: cast from restricted __be16"

For the error being flagged, using be32_to_cpu() on a be16 data type, it
was simple enough. But a review of other elements and warnings were also
evaluated.

This patch corrected several items in the original patch:

 - Using be32_to_cpu() on a be16 data type

 - cpu_to_le32() used on a std uint32_t (CPU) data type.

   Note: This is a byte array, but stored in LE layout by hardware at
   32-bit boundaries. So it possibly needed conversion.

 - Using cpu_to_le32() on a std uint16_t and assigned to a char typeA

 - Using le32_to_cpu() on a le16 type

 - Missing cpu_to_le16() on an assignment

Link: https://lore.kernel.org/r/20210830231243.6227-1-jsmart2021@gmail.com
Fixes: 9064aeb2df ("scsi: lpfc: Add EDC ELS support")
Reported-by: kernel test robot <lkp@intel.com>
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13 22:15:40 -04:00
Linus Torvalds
a9c9a6f741 SCSI misc on 20210902
This series consists of the usual driver updates (ufs, qla2xxx,
 target, smartpqi, lpfc, mpt3sas).  The core change causing the most
 churn was replacing the command request field request with a macro,
 allowing us to offset map to it and remove the redundant field; the
 same was also done for the tag field.  The most impactful change is
 the final removal of scsi_ioctl, which has been deprecated for over a
 decade.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYTD/TiYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdUkAQCjb3Ux
 4K9438mMelHlzM4er1S1IJ0WNnvObaVMNO9LBwD+JUz+rHsrKvuEX9j3g3C3u6JH
 hC3BUEW8f2LLnujWanQ=
 =lC5o
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This series consists of the usual driver updates (ufs, qla2xxx,
  target, smartpqi, lpfc, mpt3sas).

  The core change causing the most churn was replacing the command
  request field request with a macro, allowing us to offset map to it
  and remove the redundant field; the same was also done for the tag
  field.

  The most impactful change is the final removal of scsi_ioctl, which
  has been deprecated for over a decade"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (293 commits)
  scsi: ufs: Fix ufshcd_request_sense_async() for Samsung KLUFG8RHDA-B2D1
  scsi: ufs: ufs-exynos: Fix static checker warning
  scsi: mpt3sas: Use the proper SCSI midlayer interfaces for PI
  scsi: lpfc: Use the proper SCSI midlayer interfaces for PI
  scsi: lpfc: Copyright updates for 14.0.0.1 patches
  scsi: lpfc: Update lpfc version to 14.0.0.1
  scsi: lpfc: Add bsg support for retrieving adapter cmf data
  scsi: lpfc: Add cmf_info sysfs entry
  scsi: lpfc: Add debugfs support for cm framework buffers
  scsi: lpfc: Add support for maintaining the cm statistics buffer
  scsi: lpfc: Add rx monitoring statistics
  scsi: lpfc: Add support for the CM framework
  scsi: lpfc: Add cmfsync WQE support
  scsi: lpfc: Add support for cm enablement buffer
  scsi: lpfc: Add cm statistics buffer support
  scsi: lpfc: Add EDC ELS support
  scsi: lpfc: Expand FPIN and RDF receive logging
  scsi: lpfc: Add MIB feature enablement support
  scsi: lpfc: Add SET_HOST_DATA mbox cmd to pass date/time info to firmware
  scsi: fc: Add EDC ELS definition
  ...
2021-09-02 15:09:46 -07:00
Martin K. Petersen
125c12f717 scsi: lpfc: Use the proper SCSI midlayer interfaces for PI
Use the SCSI midlayer interfaces to query protection interval, reference
tag, per-command DIX flags, and logical block count.

Link: https://lore.kernel.org/r/20210817025014.12085-3-martin.petersen@oracle.com
CC: James Smart <james.smart@broadcom.com>
CC: Dick Kennedy <dick.kennedy@broadcom.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 23:10:14 -04:00
James Smart
9eb636b639 scsi: lpfc: Copyright updates for 14.0.0.1 patches
Update copyrights to 2021 for files modified in the 14.0.0.1 patch set.

Link: https://lore.kernel.org/r/20210816162901.121235-17-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:34 -04:00
James Smart
2dbf7cde53 scsi: lpfc: Update lpfc version to 14.0.0.1
Update lpfc version to 14.0.0.1

Link: https://lore.kernel.org/r/20210816162901.121235-16-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:34 -04:00
James Smart
acbaa8c8ed scsi: lpfc: Add bsg support for retrieving adapter cmf data
Add a bsg ioctl to allow user applications to retrieve the adapter
congestion management framework buffer.

Link: https://lore.kernel.org/r/20210816162901.121235-15-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:34 -04:00
James Smart
74a7baa2a3 scsi: lpfc: Add cmf_info sysfs entry
Allow abbreviated cm framework status information to be obtained via sysfs.

Link: https://lore.kernel.org/r/20210816162901.121235-14-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:34 -04:00
James Smart
9f77870870 scsi: lpfc: Add debugfs support for cm framework buffers
Add support via debugfs to report the cm statistics, cm enablement, and rx
monitor information.

Link: https://lore.kernel.org/r/20210816162901.121235-13-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:34 -04:00
James Smart
7481811c3a scsi: lpfc: Add support for maintaining the cm statistics buffer
Add the logic to move the congestion management and event information into
the cmd statistics buffer maintained for the adapter.  The update includes
rolling up values for the last minute, hour, and day information.

Link: https://lore.kernel.org/r/20210816162901.121235-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:34 -04:00
James Smart
17b27ac592 scsi: lpfc: Add rx monitoring statistics
The driver provides overwatch of the cm behavior by maintaining a set of rx
I/O statistics. This information is also used in later updating of the cm
statistics buffer.

Link: https://lore.kernel.org/r/20210816162901.121235-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:34 -04:00
James Smart
02243836ad scsi: lpfc: Add support for the CM framework
Complete the enablement of the cm framework feature in the adapter. Perform
the following:

 - Detect the presence of the congestion management framework feature.

When the cm framework is present:

 - Issue the SET_FEATURE command to enable the feature.

 - Register the cm statistics buffer with the adapter.

 - Read the cm enablement buffer to determine the cm framework state for cm
   management.

When cm management is enabled:

 - Monitor all FPIN and congestion signalling events, incrementing
   counters.

 - Regularly sync with the adapter to communicate congestion events and to
   receive an rx request limit.

 - Monitor requests for rx data and ensure that no more than the
   adapter prescribed limit is issued on the link. If the limit is
   exceeded, SCSI and/or NVMe traffic is temporarily suspended.

 - Maintain the minute, hourly, daily statistics buffer.

 - Monitor for congestion enablement change events, causing a reread of the
   enablement buffer and acting on any change in enablement.

And:

 - Add teardown logic, including buffer deregistration, on adapter
   detachment or reset.

Link: https://lore.kernel.org/r/20210816162901.121235-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:34 -04:00
James Smart
daebf93fc3 scsi: lpfc: Add cmfsync WQE support
When congestion mgmt is enabled, cmf has the driver regularly issue a
command to synchronize reporting of congestion mgmt events such as fpin and
signal delivery.

This patch adds the definition of the CMF_SYNC WQE and its CQE fields as
well as support for issuing the command. The patch also adds the few
remaining cmf-related SLI additions, such as feature definition for
enablement of CMF and notifications to the driver if the cm enablement mode
changes.

Link: https://lore.kernel.org/r/20210816162901.121235-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:34 -04:00
James Smart
72df8a4528 scsi: lpfc: Add support for cm enablement buffer
As part of the cmf framework, the firmware maintains a table with
congestion related state information, specifically whether enabled and if
enabled, whether monitoring or actively managing congestion.

Add definition of the table and add support to read the table from the
adapter and determine if it is enabled. In support of this, the READ_OBJECT
mailbox command definition is added to the driver.

Link: https://lore.kernel.org/r/20210816162901.121235-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:33 -04:00
James Smart
8c42a65c39 scsi: lpfc: Add cm statistics buffer support
The cmf framework requires the driver to maintain a cm statistics table,
accessible inband, of congestion related statistics that are reported per
minute, rolled up to per hour, and rolled up again per day. Several days
worth may be maintained.  The table is registered with the adapter when the
MIB feature is enabled.

Add definition of the table and add support to register the table with the
adapter. Includes definition and initialization of event counters that are
later added to the statistics table.

Link: https://lore.kernel.org/r/20210816162901.121235-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:33 -04:00
James Smart
9064aeb2df scsi: lpfc: Add EDC ELS support
When congestion management is enabled, issue EDC ELS to register congestion
signaling capabilities with the fabric. The response handling will process
the fabric parameters and set the reporting parameters.

Similarly, add support for receiving an EDC request from the fabric
generating a corresponding response.

Implement handlers for congestion signals from the fabric and maintain
statistics for them.

Link: https://lore.kernel.org/r/20210816162901.121235-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:33 -04:00
James Smart
428569e66f scsi: lpfc: Expand FPIN and RDF receive logging
Expand FPIN logging:

 - Display Attached Port Names for Link Integrity and Peer Congestion
   events

 - Log Delivery, Peer Congestion, and Congestion events

 - Sanity check FPIN descriptor lengths when processing FPIN descriptors.

Log RDF events when congestion logging is enabled.

Link: https://lore.kernel.org/r/20210816162901.121235-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:33 -04:00
James Smart
c6a5c747a3 scsi: lpfc: Add MIB feature enablement support
MIB support is currently limited to detecting support in the adapter and
ensuring FDMI support is enabled if present.  For the new framework MIB
support also requires active enablement of support via the SET_FEATURES
command with the firmware.

Rework the MIB detection and enablement for the following:

 - Move detection away from the get_sli4_parameters routine, and into the
   hba_setup path. get_sli4_parameters is only called once at attachment
   while hba_setup is called as part of any SLI port reset path. This
   ensures detection after firmware download.

 - Update SET_FEATURES mbx command for the MIB enablement feature and add
   support for the feature.

 - Create the cmf_setup routine to encapsulate the detection of MIB support
   and perform the enablement of the MIB support feature.

Link: https://lore.kernel.org/r/20210816162901.121235-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:33 -04:00
James Smart
3b0009c8be scsi: lpfc: Add SET_HOST_DATA mbox cmd to pass date/time info to firmware
Implement the SET_HOST_DATA mbox command to set date / time during
initialization.  It is used by the firmware for various purposes including
congestion management and firmware dumps.

Link: https://lore.kernel.org/r/20210816162901.121235-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-24 22:56:33 -04:00
Bart Van Assche
4221c8a4bd scsi: lpfc: Use scsi_cmd_to_rq() instead of scsi_cmnd.request
Prepare for removal of the request pointer by using scsi_cmd_to_rq()
instead. This patch does not change any functionality.

Link: https://lore.kernel.org/r/20210809230355.8186-28-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-11 22:25:39 -04:00
Ewan D. Milne
9977d880f7 scsi: lpfc: Move initialization of phba->poll_list earlier to avoid crash
The phba->poll_list is traversed in case of an error in
lpfc_sli4_hba_setup(), so it must be initialized earlier in case the error
path is taken.

[  490.030738] lpfc 0000:65:00.0: 0:1413 Failed to init iocb list.
[  490.036661] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[  490.044485] PGD 0 P4D 0
[  490.047027] Oops: 0000 [#1] SMP PTI
[  490.050518] CPU: 0 PID: 7 Comm: kworker/0:1 Kdump: loaded Tainted: G          I      --------- -  - 4.18.
[  490.060511] Hardware name: Dell Inc. PowerEdge R440/0WKGTH, BIOS 1.4.8 05/22/2018
[  490.067994] Workqueue: events work_for_cpu_fn
[  490.072371] RIP: 0010:lpfc_sli4_cleanup_poll_list+0x20/0xb0 [lpfc]
[  490.078546] Code: cf e9 04 f7 fe ff 0f 1f 40 00 0f 1f 44 00 00 41 57 49 89 ff 41 56 41 55 41 54 4d 8d a79
[  490.097291] RSP: 0018:ffffbd1a463dbcc8 EFLAGS: 00010246
[  490.102518] RAX: 0000000000008200 RBX: ffff945cdb8c0000 RCX: 0000000000000000
[  490.109649] RDX: 0000000000018200 RSI: ffff9468d0e16818 RDI: 0000000000000000
[  490.116783] RBP: ffff945cdb8c1740 R08: 00000000000015c5 R09: 0000000000000042
[  490.123915] R10: 0000000000000000 R11: ffffbd1a463dbab0 R12: ffff945cdb8c25c0
[  490.131049] R13: 00000000fffffff4 R14: 0000000000001800 R15: ffff945cdb8c0000
[  490.138182] FS:  0000000000000000(0000) GS:ffff9468d0e00000(0000) knlGS:0000000000000000
[  490.146267] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  490.152013] CR2: 0000000000000000 CR3: 000000042ca10002 CR4: 00000000007706f0
[  490.159146] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  490.166277] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  490.173409] PKRU: 55555554
[  490.176123] Call Trace:
[  490.178598]  lpfc_sli4_queue_destroy+0x7f/0x3c0 [lpfc]
[  490.183745]  lpfc_sli4_hba_setup+0x1bc7/0x23e0 [lpfc]
[  490.188797]  ? kernfs_activate+0x63/0x80
[  490.192721]  ? kernfs_add_one+0xe7/0x130
[  490.196647]  ? __kernfs_create_file+0x80/0xb0
[  490.201020]  ? lpfc_pci_probe_one_s4.isra.48+0x46f/0x9e0 [lpfc]
[  490.206944]  lpfc_pci_probe_one_s4.isra.48+0x46f/0x9e0 [lpfc]
[  490.212697]  lpfc_pci_probe_one+0x179/0xb70 [lpfc]
[  490.217492]  local_pci_probe+0x41/0x90
[  490.221246]  work_for_cpu_fn+0x16/0x20
[  490.224994]  process_one_work+0x1a7/0x360
[  490.229009]  ? create_worker+0x1a0/0x1a0
[  490.232933]  worker_thread+0x1cf/0x390
[  490.236687]  ? create_worker+0x1a0/0x1a0
[  490.240612]  kthread+0x116/0x130
[  490.243846]  ? kthread_flush_work_fn+0x10/0x10
[  490.248293]  ret_from_fork+0x35/0x40
[  490.251869] Modules linked in: lpfc(+) xt_CHECKSUM ipt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4i
[  490.332609] CR2: 0000000000000000

Link: https://lore.kernel.org/r/20210809150947.18104-1-emilne@redhat.com
Fixes: 93a4d6f401 ("scsi: lpfc: Add registration for CPU Offline/Online events")
Cc: stable@vger.kernel.org
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-09 22:45:51 -04:00
James Smart
7740b615b6 scsi: lpfc: Fix possible ABBA deadlock in nvmet_xri_aborted()
The lpfc_sli4_nvmet_xri_aborted() routine takes out the abts_buf_list_lock
and traverses the buffer contexts to match the xri. Upon match, it then
takes the context lock before potentially removing the context from the
associated buffer list. This violates the lock hierarchy used elsewhere in
the driver of locking context, then the abts_buf_list_lock - thus a
possible deadlock.

Resolve by: after matching, release the abts_buf_list_lock, then take the
context lock, and if to be deleted from the list, retake the
abts_buf_list_lock, maintaining lock hierarchy. This matches same list lock
hierarchy as elsewhere in the driver

Link: https://lore.kernel.org/r/20210730163309.25809-1-jsmart2021@gmail.com
Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-30 23:47:19 -04:00
Colin Ian King
ff2d86d04d scsi: lpfc: Remove redundant assignment to pointer pcmd
The pointer pcmd is being initialized with a value that is never read, the
assignment is redundant and can be removed.

Link: https://lore.kernel.org/r/20210721095350.41564-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Addresses-Coverity: ("Unused value")
2021-07-27 00:06:41 -04:00
James Smart
45e524d61e scsi: lpfc: Copyright updates for 14.0.0.0 patches
Update copyrights to 2021 for files modified in the 14.0.0.0 patch set.

Link: https://lore.kernel.org/r/20210722221721.74388-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-27 00:06:41 -04:00
James Smart
95518cabe1 scsi: lpfc: Update lpfc version to 14.0.0.0
Update lpfc version to 14.0.0.0.

Link: https://lore.kernel.org/r/20210722221721.74388-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-27 00:06:41 -04:00
James Smart
bfc477854a scsi: lpfc: Add 256 Gb link speed support
Update routines to support 256 Gb link speed for LPe37000/LPe38000
adapters. 256 Gb speeds can be seen on trunk links.

Link: https://lore.kernel.org/r/20210722221721.74388-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-27 00:06:41 -04:00
James Smart
f6c5e6c456 scsi: lpfc: Revise Topology and RAS support checks for new adapters
Support for Topology and RAS logging capabilities were qualified by PCIe
device ID checks necessitating additional driver changes for new device
IDs.

Reduce reliance on specific PCIe device IDs by substituting checks for SLI
family information. This automatically picks up support on the newest
hardware.

Link: https://lore.kernel.org/r/20210722221721.74388-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-27 00:06:41 -04:00
James Smart
df3d78c3eb scsi: lpfc: Fix cq_id truncation in rq create
On the newer hardware, CQ_ID values can be larger than seen on previous
generations. This exposed an issue in the driver where its definition of
cq_id in the RQ Create mailbox cmd was too small, thus the cq_id was
truncated, causing the command to fail.

Revise the RQ_CREATE CQ_ID field to its proper size (16 bits).

Link: https://lore.kernel.org/r/20210722221721.74388-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-27 00:06:40 -04:00
James Smart
f449a3d7a1 scsi: lpfc: Add PCI ID support for LPe37000/LPe38000 series adapters
Update supported pci_device_id table to include the values for the G7+ ASIC
Device ID utilized by LPe37xxx and LPe38xxx series of adapters.  The
default reporting string will be "LPe38000".

Link: https://lore.kernel.org/r/20210722221721.74388-2-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-27 00:06:40 -04:00
James Smart
f2af8ffc63 scsi: lpfc: Copyright updates for 12.8.0.11 patches
Update copyrights for files modified by the 12.8.0.11 patch set.

Link: https://lore.kernel.org/r/20210707184351.67872-21-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:38 -04:00
James Smart
545a68e711 scsi: lpfc: Update lpfc version to 12.8.0.11
Update lpfc version to 12.8.0.11.

Link: https://lore.kernel.org/r/20210707184351.67872-20-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:37 -04:00
James Smart
ab80386088 scsi: lpfc: Skip issuing ADISC when node is in NPR state
When a node moves to NPR state due to a device recovery event, the
nlp_fc4_types in the node are cleared. An ADISC received for a node in the
NPR state triggers an ADISC. Without fc4 types being known, the calls to
register with the transport are no-op'd, thus no additional references are
placed on the node by transport re-registrations. A subsequent RSCN could
trigger another unregister request, which will decrement the reference
counts, leading to the ref count hitting zero and the node being freed
while futher discovery on the node is being attempted by the RSCN event
handling.

Fix by skipping the trigger of an ADISC when in NPR state. The normal ADISC
process will kick off in the regular discovery path after receiving a
response from name server.

Link: https://lore.kernel.org/r/20210707184351.67872-19-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:37 -04:00
James Smart
02607fbaf0 scsi: lpfc: Skip reg_vpi when link is down for SLI3 in ADISC cmpl path
During RSCN storms, some instances of LIP on SLI-3 adapters lead to a
situation where FLOGIs keep failing with firmware indicating an illegal
command error code.  This situation was preceded by an ADISC completion
that was processed while the link was down. This path on SLI-3 performs a
CLEAR_LA and attempts to activate a VPI with REG_VPI.  Later, as the FLOGI
completes, it's no longer in sync with the VPI state.  In SLI-3 it is
illegal to have an active VPI during FLOGI.

Resolve by circumventing the SLI-3 path that performs the CLEAR_LA and
REG_VPI. The path will be taken after the FLOGI after the next Link Up.

Link: https://lore.kernel.org/r/20210707184351.67872-18-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:37 -04:00
James Smart
c65436b21c scsi: lpfc: Call discovery state machine when handling PLOGI/ADISC completions
In the PLOGI and ADISC completion handling, the device removal event could
be skipped during some link errors. This could leave a stale node in UNUSED
state.  Driver unload would hang for a long time waiting for this node to
be freed.

Resolve by taking the following steps:

 - Always post ADISC completion events to discovery state machine upon
   ADISC completion.

 - In case of a completion error for PLOGI/ADISC, ensure that init refcount
   is dropped if not registered with transport.

Link: https://lore.kernel.org/r/20210707184351.67872-17-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:37 -04:00
James Smart
0614568361 scsi: lpfc: Delay unregistering from transport until GIDFT or ADISC completes
On an RSCN event, the nodes specified in RSCN payload and in MAPPED state
are moved to NPR state in order to revalidate the login. This triggers an
immediate unregister from SCSI/NVMe backend. The assumption is that the
node may be missing. The re-registration with the backend happens after
either relogin (PLOGI/PRLI; if ADISC is disabled or login truly lost) or
when ADISC completes successfully (rediscover with ADISC enabled).

However, the NVMe-FC standard provides for an RSCN to be triggered when
the remote port supports a discovery controller and there was a change
of discovery log content. As the remote port typically also supports
storage subsystems, this unregister causes all storage controller
connections to fail and require reconnect.

Correct by reworking the code to ensure that the unregistration only occurs
when a login state is truly terminated, thereby leaving the NVMe storage
controllers in place.

The changes made are:

 - Retain node state in ADISC_ISSUE when scheduling ADISC ELS retry.

 - Do not clear wwpn/wwnn values upon ADISC failure.

 - Move MAPPED nodes to NPR during RSCN processing, but do not unregister
   with transport.  On GIDFT completion, identify missing nodes (not marked
   NLP_NPR_2B_DISC) and unregister them.

 - Perform unregistration for nodes that will go through ADISC processing
   if ADISC completion fails.

 - Successful ADISC completion will move node back to MAPPED state.

Link: https://lore.kernel.org/r/20210707184351.67872-16-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:37 -04:00
James Smart
816bd88dff scsi: lpfc: Enable adisc discovery after RSCN by default
Assign a default value of 1 to driver module parameter lpfc_use_adisc.

Link: https://lore.kernel.org/r/20210707184351.67872-15-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:37 -04:00
James Smart
137ddf0384 scsi: lpfc: Use PBDE feature enabled bit to determine PBDE support
The SLI4 interface changed the manner used to indicate PBDE support.
Rework the driver to check for PBDE support via the PBDE feature bit in
COMMON_GET_SLI4_PARAMETERS.

Link: https://lore.kernel.org/r/20210707184351.67872-14-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:36 -04:00
James Smart
a9978e3978 scsi: lpfc: Clear outstanding active mailbox during PCI function reset
Mailbox commands sent via ioctl/bsg from user applications may be
interrupted from processing by a concurrently triggered PCI function
reset. The command will not generate a completion due to the reset.  This
results in a user application hang waiting for the mailbox command to
complete.

Resolve by changing the function reset handler to detect that there was an
outstanding mailbox command and simulate a mailbox completion.  Add some
additional debug when a mailbox command times out.

Link: https://lore.kernel.org/r/20210707184351.67872-13-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:36 -04:00
James Smart
affbe24429 scsi: lpfc: Fix KASAN slab-out-of-bounds in lpfc_unreg_rpi() routine
In lpfc_offline_prep() an RPI is freed and nlp_rpi set to 0xFFFF before
calling lpfc_unreg_rpi().  Unfortunately, lpfc_unreg_rpi() uses nlp_rpi to
index the sli4_hba.rpi_ids[] array.

In lpfc_offline_prep(), unreg rpi before freeing the rpi.

Link: https://lore.kernel.org/r/20210707184351.67872-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:36 -04:00
James Smart
e78c006f4c scsi: lpfc: Remove REG_LOGIN check requirement to issue an ELS RDF
Since the REG_LOGIN to the fabric controller happens in parallel with SCR,
it may or may not be completed by the time RDF is sent.  RDF and SCR are
sent to the fabric in parallel, so checking for a completed REG_LOGIN in
the RDF submit path is not needed.

Remove the REG_LOGI check from the RDF submission path.

Link: https://lore.kernel.org/r/20210707184351.67872-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:36 -04:00
James Smart
cd6047e92c scsi: lpfc: Fix memory leaks in error paths while issuing ELS RDF/SCR request
The ELS job request structure, that is allocated while issuing ELS RDF/SCR
request path, is not being released in an error path causing a memory leak
message on driver unload.

Free the ELS job structure in the error paths.

Link: https://lore.kernel.org/r/20210707184351.67872-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:36 -04:00
James Smart
2d338eb55b scsi: lpfc: Fix NULL ptr dereference with NPIV ports for RDF handling
RDF ELS handling for NPIV ports may result in an incorrect NDLP reference
count.  In the event of a persistent link down, this may lead to premature
release of an NDLP structure and subsequent NULL ptr dereference panic.

Remove extraneous lpfc_nlp_put() call in NPIV port RDF processing.

Link: https://lore.kernel.org/r/20210707184351.67872-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:35 -04:00
James Smart
4e670c8afd scsi: lpfc: Keep NDLP reference until after freeing the IOCB after ELS handling
In the routine that generically cleans up an ELS after completion, the NDLP
put is done prior to the freeing of the IOCB. The IOCB may reference the
NDLP.

Move the lpfc_nlp_put() after freeing the IOCB.

Link: https://lore.kernel.org/r/20210707184351.67872-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:35 -04:00
James Smart
21990d3d18 scsi: lpfc: Fix target reset handler from falsely returning FAILURE
Previous logic accidentally overrides the status variable to FAILURE when
target reset status is SUCCESS.

Refactor the non-SUCCESS logic of lpfc_vmid_vport_cleanup(), which resolves
the false override.

Link: https://lore.kernel.org/r/20210707184351.67872-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:35 -04:00
James Smart
e77803bdbf scsi: lpfc: Discovery state machine fixes for LOGO handling
If a LOGO is received for a node that is in the NPR state, post a DEVICE_RM
event to allow clean up of the logged out node.

Clearing the NLP_NPR_2B_DISC flag upon receipt of a LOGO ACC may cause
skipping of processing outstanding PLOGIs triggered by parallel RSCN
events.  If an outstanding PLOGI is being retried and receipt of a LOGO ACC
occurs, then allow the discovery state machine's PLOGI completion to clean
up the node.

Link: https://lore.kernel.org/r/20210707184351.67872-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:35 -04:00
James Smart
50baa1595d scsi: lpfc: Fix function description comments for vmid routines
Update comment headers for functions lpfc_vmid_cmd and lpfc_vmid_poll.

Link: https://lore.kernel.org/r/20210707184351.67872-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:35 -04:00
James Smart
16a93e83c8 scsi: lpfc: Improve firmware download logging
Define additional status fields in mailbox commands to help provide
additional information when downloading new firmware.

Link: https://lore.kernel.org/r/20210707184351.67872-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:35 -04:00
James Smart
e861308405 scsi: lpfc: Remove use of kmalloc() in trace event logging
There are instances when trace event logs are triggered from an interrupt
context. The trace event log may attempt to alloc memory causing scheduling
while atomic bug call traces.

Remove the need for the kmalloc'ed vport array when checking the
log_verbose flag, which eliminates the need for any allocation.

Link: https://lore.kernel.org/r/20210707184351.67872-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:34 -04:00
James Smart
ae463b6023 scsi: lpfc: Fix NVMe support reporting in log message
The NVMe support indicator in log message 6422 is displaying a field that
was initialized but never set to indicate NVMe support.  Remove obsolete
nvme_support element from the lpfc_hba structure and change log message to
display NVMe support status as reported in SLI4 Config Parameters mailbox
command.

Link: https://lore.kernel.org/r/20210707184351.67872-2-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-07-18 22:30:34 -04:00
Linus Torvalds
f5c13f1fde Driver core changes for 5.14-rc1
Here is the small set of driver core and debugfs updates for 5.14-rc1.
 
 Included in here are:
 	- debugfs api cleanups (touched some drivers)
 	- devres updates
 	- tiny driver core updates and tweaks
 
 Nothing major in here at all, and all have been in linux-next for a
 while with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYOM7jA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yloDQCfZOlLYXF+2KgXJQqevNnRiu7/B1gAn3aCX6xh
 UWVUfu5LDIXi2uFERRT1
 =Ze3R
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core changes from Greg KH:
 "Here is the small set of driver core and debugfs updates for 5.14-rc1.

  Included in here are:

   - debugfs api cleanups (touched some drivers)

   - devres updates

   - tiny driver core updates and tweaks

  Nothing major in here at all, and all have been in linux-next for a
  while with no reported issues"

* tag 'driver-core-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (27 commits)
  docs: ABI: testing: sysfs-firmware-memmap: add some memmap types.
  devres: Enable trace events
  devres: No need to call remove_nodes() when there none present
  devres: Use list_for_each_safe_from() in remove_nodes()
  devres: Make locking straight forward in release_nodes()
  kernfs: move revalidate to be near lookup
  drivers/base: Constify static attribute_group structs
  firmware_loader: remove unneeded 'comma' macro
  devcoredump: remove contact information
  driver core: Drop helper devm_platform_ioremap_resource_wc()
  component: Rename 'dev' to 'parent'
  component: Drop 'dev' argument to component_match_realloc()
  device property: Don't check for NULL twice in the loops
  driver core: auxiliary bus: Fix typo in the docs
  drivers/base/node.c: make CACHE_ATTR define static DEVICE_ATTR_RO
  debugfs: remove return value of debugfs_create_ulong()
  debugfs: remove return value of debugfs_create_bool()
  scsi: snic: debugfs: remove local storage of debugfs files
  b43: don't save dentries for debugfs
  b43legacy: don't save dentries for debugfs
  ...
2021-07-05 13:51:41 -07:00
Linus Torvalds
bd31b9efbf SCSI misc on 20210702
This series consists of the usual driver updates (ufs, ibmvfc,
 megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with
 elx and mpi3mr being new drivers.  The major core change is a rework
 to drop the status byte handling macros and the old bit shifted
 definitions and the rest of the updates are minor fixes.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYN7I6iYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXpRAQCkngYZ
 35yQrqOxgOk2pfrysE95tHrV1MfJm2U49NFTwAEAuZutEvBUTfBF+sbcJ06r6q7i
 H0hkJN/Io7enFs5v3WA=
 =zwIa
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This series consists of the usual driver updates (ufs, ibmvfc,
  megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with
  elx and mpi3mr being new drivers.

  The major core change is a rework to drop the status byte handling
  macros and the old bit shifted definitions and the rest of the updates
  are minor fixes"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (287 commits)
  scsi: aha1740: Avoid over-read of sense buffer
  scsi: arcmsr: Avoid over-read of sense buffer
  scsi: ips: Avoid over-read of sense buffer
  scsi: ufs: ufs-mediatek: Add missing of_node_put() in ufs_mtk_probe()
  scsi: elx: libefc: Fix IRQ restore in efc_domain_dispatch_frame()
  scsi: elx: libefc: Fix less than zero comparison of a unsigned int
  scsi: elx: efct: Fix pointer error checking in debugfs init
  scsi: elx: efct: Fix is_originator return code type
  scsi: elx: efct: Fix link error for _bad_cmpxchg
  scsi: elx: efct: Eliminate unnecessary boolean check in efct_hw_command_cancel()
  scsi: elx: efct: Do not use id uninitialized in efct_lio_setup_session()
  scsi: elx: efct: Fix error handling in efct_hw_init()
  scsi: elx: efct: Remove redundant initialization of variable lun
  scsi: elx: efct: Fix spelling mistake "Unexected" -> "Unexpected"
  scsi: lpfc: Fix build error in lpfc_scsi.c
  scsi: target: iscsi: Remove redundant continue statement
  scsi: qla4xxx: Remove redundant continue statement
  scsi: ppa: Switch to use module_parport_driver()
  scsi: imm: Switch to use module_parport_driver()
  scsi: mpt3sas: Fix error return value in _scsih_expander_add()
  ...
2021-07-02 15:14:36 -07:00
James Smart
66b4d63bdd scsi: lpfc: Fix build error in lpfc_scsi.c
Integration with VMID patches resulted in a build error when
CONFIG_DEBUG_FS is disabled and driver option CONFIG_SCSI_LPFC_DEBUG_FS is
disabled.

It results in an undefined variable:
lpfc_scsi:5595:3: error: 'uuid' undeclared (first use in this function); did you mean 'upid'?

Link: https://lore.kernel.org/r/20210618171842.79710-1-jsmart2021@gmail.com
Fixes: 33c79741de ("scsi: lpfc: vmid: Introduce VMID in I/O path")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-18 23:01:03 -04:00
Zou Wei
4701808360 scsi: lpfc: Use list_move_tail() instead of list_del()/list_add_tail()
Using list_move_tail() instead of list_del() + list_add_tail().

Link: https://lore.kernel.org/r/1623113493-49384-1-git-send-email-zou_wei@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-15 22:25:56 -04:00
Greg Kroah-Hartman
68afbd8459 Linux 5.13-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmDGe+4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG/IUH/iyHVulAtAhL9bnR
 qL4M1kWfcG1sKS2TzGRZzo6YiUABf89vFP90r4sKxG3AKrb8YkTwmJr8B/sWwcsv
 PpKkXXTobbDfpSrsXGEapBkQOE7h2w739XeXyBLRPkoCR4UrEFn68TV2rLjMLBPS
 /EIZkonXLWzzWalgKDP4wSJ7GaQxi3LMx3dGAvbFArEGZ1mPHNlgWy2VokFY/yBf
 qh1EZ5rugysc78JCpTqfTf3fUPK2idQW5gtHSMbyESrWwJ/3XXL9o1ET3JWURYf1
 b0FgVztzddwgULoIGWLxDH5WWts3l54sjBLj0yrLUlnGKA5FjrZb12g9PdhdywuY
 /8KfjeE=
 =JfJm
 -----END PGP SIGNATURE-----

Merge tag 'v5.13-rc6' into driver-core-next

We need the driver core fix in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-14 09:07:45 +02:00
Gaurav Srivastava
33c79741de scsi: lpfc: vmid: Introduce VMID in I/O path
Introduce the VMID in the I/O path. Check if the VMID is enabled and if
the I/O belongs to a VM or not.

Link: https://lore.kernel.org/r/20210608043556.274139-14-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-10 10:01:33 -04:00
Gaurav Srivastava
0c4792c64f scsi: lpfc: vmid: Add QFPA and VMID timeout check in worker thread
Add a periodic check for issuing of QFPA command and VMID timeout in the
worker thread. The inactivity timeout check is added via the timer
function.

Link: https://lore.kernel.org/r/20210608043556.274139-13-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-10 10:01:33 -04:00
Gaurav Srivastava
20397179aa scsi: lpfc: vmid: Timeout implementation for VMID
Implement timeout functionality for the VMID. After the set time period of
inactivity, the VMID is deregistered from the switch.

Link: https://lore.kernel.org/r/20210608043556.274139-12-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-10 10:01:33 -04:00
Gaurav Srivastava
f56e86a082 scsi: lpfc: vmid: Append the VMID to the wqe before sending
Add the VMID in wqe before sending out the request. The type of VMID
depends on the configured type and is checked before being appended.

Link: https://lore.kernel.org/r/20210608043556.274139-11-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-10 10:01:33 -04:00
Gaurav Srivastava
742b0cf87a scsi: lpfc: vmid: Implement CT commands for appid
Implement CT commands for registering and deregistering the appid for the
application. Also, a small change in decrementing the ndlp ref counter has
been added.

Link: https://lore.kernel.org/r/20210608043556.274139-10-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-10 10:01:33 -04:00
Gaurav Srivastava
dc50715e5c scsi: lpfc: vmid: Functions to manage VMIDs
Implement routines to save, retrieve, and remove the VMIDs from the data
structure. A hash table is used to save the VMIDs and the corresponding
UUIDs associated with the application/VMs.

Link: https://lore.kernel.org/r/20210608043556.274139-9-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-10 10:01:33 -04:00
Gaurav Srivastava
7e473de75e scsi: lpfc: vmid: Implement ELS commands for appid
Implement ELS commands QFPA and UVEM for the priority tagging appid
support.

Link: https://lore.kernel.org/r/20210608043556.274139-8-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-10 10:01:32 -04:00
Gaurav Srivastava
5e633302ac scsi: lpfc: vmid: Add support for VMID in mailbox command
Add supporting datastructures for mailbox command which helps in
determining if the firmware supports appid. Allocate resources for VMID at
initialization time and clean them up on removal.

Link: https://lore.kernel.org/r/20210608043556.274139-7-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-10 10:01:32 -04:00
Gaurav Srivastava
7ba2272caa scsi: lpfc: vmid: VMID parameter initialization
Initialize parameters such as type of VMID, max number of VMIDs supported
and timeout value for the VMID registration based on the user input.

Link: https://lore.kernel.org/r/20210608043556.274139-6-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-10 10:01:32 -04:00
Gaurav Srivastava
02169e845d scsi: lpfc: vmid: Add datastructure for supporting VMID in lpfc
Add the primary datastructures needed to implement VMID in the lpfc
driver. Maintain the capability, current state, and hash table for the
vmid/appid along with other information. This implementation supports the
two versions of vmid implementation (app header and priority tagging).

Link: https://lore.kernel.org/r/20210608043556.274139-5-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-10 10:01:32 -04:00
James Smart
696770e72f scsi: lpfc: Fix failure to transmit ABTS on FC link
The abort_cmd_ia flag in an abort wqe describes whether an ABTS basic link
service should be transmitted on the FC link or not.  Code added in
lpfc_sli4_issue_abort_iotag() set the abort_cmd_ia flag incorrectly,
surpressing ABTS transmission.

A previous LPFC change to build an abort wqe inverted prior logic that
determined whether an ABTS was to be issued on the FC link.

Revert this logic to its proper state.

Link: https://lore.kernel.org/r/20210528212240.11387-1-jsmart2021@gmail.com
Fixes: db7531d2b3 ("scsi: lpfc: Convert abort handling to SLI-3 and SLI-4 handlers")
Cc: <stable@vger.kernel.org> # v5.11+
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-31 23:00:42 -04:00
Hannes Reinecke
f2b1e9c6f8 scsi: core: Introduce scsi_build_sense()
Introduce scsi_build_sense() as a wrapper around scsi_build_sense_buffer()
to format the buffer and set the correct SCSI status.

Link: https://lore.kernel.org/r/20210427083046.31620-8-hare@suse.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-31 22:48:21 -04:00
James Smart
e5e0280db7 scsi: lpfc: Update lpfc version to 12.8.0.10
Update lpfc version to 12.8.0.10

Link: https://lore.kernel.org/r/20210514195559.119853-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:28 -04:00
James Smart
8eced80707 scsi: lpfc: Reregister FPIN types if ELS_RDF is received from fabric controller
FC-LS-5 specifies that a received RDF implies a possible change to fabric
supported diagnostic functions. Endpoints are to re-perform the RDF
exchange with the fabric to enable possible new features or adapt to
changes in values.

This patch adds the logic to RDF receive to re-perform the RDF exchange
with the switch.

Link: https://lore.kernel.org/r/20210514195559.119853-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:28 -04:00
James Smart
3e49af9393 scsi: lpfc: Add a option to enable interlocked ABTS before job completion
Default behavior for the driver, when aborting an I/O, is to terminate the
I/O with the adapter. The adapter will initiate an ABTS to terminate the
exchange on the link and mark the exchange is terminated so that no further
use of the sgl or any traffic for the exchange is worked on. Completion on
the Abort is then posted to the driver, which as the I/O is terminated can
complete the I/O to the OS. This completion may occur prior to the ABTS
handshake completing on the wire. The ABTS handshake can take a long time
to complete with timeouts and retries reaching 60+ seconds. Note: if
retries fail, LOGO occurs.

Some devices want to ensure that the ABTS handshake fully completes (this
device has fully ack'd it) before the I/O completion is posted back to the
OS, where a failed I/O may be retried via a different path.

To support this behavior, an option was added to the driver to change I/O
completion from the Abort cmd completion to the Exchange termination (aka
ABTS) completion.

Link: https://lore.kernel.org/r/20210514195559.119853-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:28 -04:00
James Smart
5aa615d195 scsi: lpfc: Fix crash when lpfc_sli4_hba_setup() fails to initialize the SGLs
The driver is encountering a crash in lpfc_free_iocb_list() while
performing initial attachment.

Code review found this to be an errant failure path that was taken, jumping
to a tag that then referenced structures that were uninitialized.

Fix the failure path.

Link: https://lore.kernel.org/r/20210514195559.119853-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:28 -04:00
James Smart
04c1d9c50a scsi: lpfc: Ignore GID-FT response that may be received after a link flip
When a link bounce happens, there is a possibility that responses to
requests posted prior to the link bounce could be received. This is
problematic as the counter to track reglogin completion after link up can
become out of sync with the real state.

As there is no reason to process a request made in a prior link up context,
eliminate all the disturbance by tagging the request with the event_tag
maintained by the SLI Port for the link. The event_tag will change on every
link state transition.  As long as the tag matches the current event_tag,
the response can be processed. If it doesn't match, just discard the
response.

Link: https://lore.kernel.org/r/20210514195559.119853-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:28 -04:00
James Smart
fe83e3b9b4 scsi: lpfc: Fix node handling for Fabric Controller and Domain Controller
During link bounce testing, RPI counts were seen to differ from the number
of nodes. For fabric and domain controllers, a temporary RPI is assigned,
but the code isn't registering it. If the nodes do go away, such as on link
down, the temporary RPI isn't being released.

Change the way these two fabric services are managed, make them behave like
any other remote port. Register the RPI and register with the transport.
Never leave the nodes in a NPR or UNUSED state where their RPI is in limbo.
This allows them to follow normal dev_loss_tmo handling, RPI refcounting,
and normal removal rules. It also allows fabric I/Os to use the RPI for
traffic requests.

Note: There is some logic that still has a couple of exceptions when the
Domain controller (0xfffcXX). There are cases where the fabric won't have a
valid login but will send RDP. Other times, it will it send a LOGO then an
RDP. It makes for ad-hoc behavior to manage the node. Exceptions are
documented in the code.

Link: https://lore.kernel.org/r/20210514195559.119853-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:28 -04:00
James Smart
4012baeab6 scsi: lpfc: Fix Node recovery when driver is handling simultaneous PLOGIs
When lpfc is handling a solicited and unsolicited PLOGI with another
initiator, the remote initiator is never recovered. The node for the
initiator is erroneouosly removed and all resources released.

In lpfc_cmpl_els_plogi(), when lpfc_els_retry() returns a failure code, the
driver is calling the state machine with a device remove event because the
remote port is not currently registered with the SCSI or NVMe
transports. The issue is that on a PLOGI "collision" the driver correctly
aborts the solicited PLOGI and allows the unsolicited PLOGI to complete the
process, but this process is interrupted with a device_rm event.

Introduce logic in the PLOGI completion to capture the PLOGI collision
event and jump out of the routine.  This will avoid removal of the node.
If there is no collision, the normal node removal will occur.

Fixes: 	52edb2caf6 ("scsi: lpfc: Remove ndlp when a PLOGI/ADISC/PRLI/REG_RPI ultimately fails")
Cc: <stable@vger.kernel.org> # v5.11+
Link: https://lore.kernel.org/r/20210514195559.119853-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:27 -04:00
James Smart
1037e4b4f8 scsi: lpfc: Add ndlp kref accounting for resume RPI path
The driver is crashing due to a bad pointer during driver load due in an
adisc acc receive routine. The driver is missing node get/put in the
mbx_resume_rpi paths.

Fix by adding the proper gets and puts into the resume_rpi path.

Link: https://lore.kernel.org/r/20210514195559.119853-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:27 -04:00
James Smart
e30d55137e scsi: lpfc: Fix "Unexpected timeout" error in direct attach topology
An 'unexpected timeout' message may be seen in a point-2-point topology.
The message occurs when a PLOGI is received before the driver is notified
of FLOGI completion. The FLOGI completion failure causes discovery to be
triggered for a second time. The discovery timer is restarted but no new
discovery activity is initiated, thus the timeout message eventually
appears.

In point-2-point, when discovery has progressed before the FLOGI completion
is processed, it is not a failure. Add code to FLOGI completion to detect
that discovery has progressed and exit the FLOGI handling (noop'ing it).

Link: https://lore.kernel.org/r/20210514195559.119853-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:27 -04:00
James Smart
fa21189db9 scsi: lpfc: Fix non-optimized ERSP handling
When processing an NVMe ERSP IU which didn't match the optimized CQE-only
path, the status was being left to the WQE status. WQE status is non-zero
as it is indicating a non-optimized completion that needs to be handled by
the driver.

Fix by clearing the status field when falling into the non-optimized
case. Log message added to track optimized vs non-optimized debug.

Link: https://lore.kernel.org/r/20210514195559.119853-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:27 -04:00
James Smart
01131e7aae scsi: lpfc: Fix unreleased RPIs when NPIV ports are created
While testing NPIV and watching logins and used RPI levels, it was seen the
used RPI count was much higher than the number of remote ports discovered.

Code inspection showed that remote port removals on any NPIV instance are
releasing the RPI, but not performing an UNREG_RPI with the adapter thus
the reference counting never fully drops and the RPI is never fully
released. This was happening on NPIV nodes due to a log of fabric ELS's to
fabric addresses. This lack of UNREG_RPI was introduced by a prior node
rework patch that performed the UNREG_RPI as part of node cleanup.

To resolve the issue, do the following:

 - Restore the RPI release code, but move the location to so that it is in
   line with the new node cleanup design.

 - NPIV ports now release the RPI and drop the node when the caller sets
   the NLP_RELEASE_RPI flag.

 - Set the NLP_RELEASE_RPI flag in node cleanup which will trigger a
   release of RPI to free pool.

 - Ensure there's an UNREG_RPI at LOGO completion so that RPI release is
   completed.

 - Stop offline_prep from skipping nodes that are UNUSED. The RPI may
   not have been released.

 - Stop the default RPI handling in lpfc_cmpl_els_rsp() for SLI4.

 - Fixed up debugfs RPI displays for better debugging.

Fixes: a70e63eee1 ("scsi: lpfc: Fix NPIV Fabric Node reference counting")
Link: https://lore.kernel.org/r/20210514195559.119853-2-jsmart2021@gmail.com
Cc: <stable@vger.kernel.org> # v5.11+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 23:23:27 -04:00
Shawn Guo
0733d83905 firmware: replace HOTPLUG with UEVENT in FW_ACTION defines
With commit 312c004d36 ("[PATCH] driver core: replace "hotplug" by
"uevent"") already in the tree over a decade, update the name of
FW_ACTION defines to follow semantics, and reflect what the defines are
really meant for, i.e. whether or not generate user space event.

Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Link: https://lore.kernel.org/r/20210425020024.28057-1-shawn.guo@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-13 16:14:45 +02:00
Colin Ian King
52b2599081 scsi: lpfc: Remove redundant assignment to pointer temp_hdr
The pointer tmp_hdr is being assigned a value that is never read, the
assignment is redundant and can be removed.

Link: https://lore.kernel.org/r/20210420104123.376420-1-colin.king@canonical.com
Addresses-Coverity: ("Unused value")
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-10 13:24:06 -04:00
James Smart
e4ec10228f scsi: lpfc: Fix bad memory access during VPD DUMP mailbox command
The dump command for reading a region passes a requested read length
specified in words (4-byte units). The response overwrites the same field
with the actual number of bytes read.

The mailbox handler for DUMP which reads VPD data (region 23) is treating
the response field as if it were still a word_cnt, thus multiplying it by 4
to set the read's "length". Given the read value was calculated based on
the size of the read buffer, the longer response length runs off the end of
the buffer.

Fix by reworking the code to use the response field as a byte count.

Link: https://lore.kernel.org/r/20210421234511.102206-1-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-26 22:58:38 -04:00
James Smart
83adbba746 scsi: lpfc: Fix DMA virtual address ptr assignment in bsg
lpfc_bsg_ct_unsol_event() routine acts assigns a ct_request to the wrong
structure address, resulting in a bad address that results in bsg related
timeouts.

Correct the ct_request assignment to use the kernel virtual buffer address
(not the control structure address).

Link: https://lore.kernel.org/r/20210421234448.102132-1-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-26 22:57:27 -04:00
James Smart
e136471135 scsi: lpfc: Fix illegal memory access on Abort IOCBs
In devloss timer handler and in backend calls to terminate remote port I/O,
there is logic to walk through all active IOCBs and validate them to
potentially trigger an abort request. This logic is causing illegal memory
accesses which leads to a crash. Abort IOCBs, which may be on the list, do
not have an associated lpfc_io_buf struct. The driver is trying to map an
lpfc_io_buf struct on the IOCB and which results in a bogus address thus
the issue.

Fix by skipping over ABORT IOCBs (CLOSE IOCBs are ABORTS that don't send
ABTS) in the IOCB scan logic.

Link: https://lore.kernel.org/r/20210421234433.102079-1-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-26 22:56:31 -04:00
James Smart
cf270817ca scsi: lpfc: Copyright updates for 12.8.0.9 patches
Update copyrights to 2021 for files modified in the 12.8.0.9 patch set.

Link: https://lore.kernel.org/r/20210412013127.2387-17-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:14 -04:00
James Smart
3ebd25b0a4 scsi: lpfc: Update lpfc version to 12.8.0.9
Update lpfc version to 12.8.0.9

Link: https://lore.kernel.org/r/20210412013127.2387-16-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:14 -04:00
James Smart
5b1f5089b6 scsi: lpfc: Eliminate use of LPFC_DRIVER_NAME in lpfc_attr.c
During code inspection, several cases of creating a dynamic attribute names
in logs messages using a define was found. This is unnecessary.

Place the native symbol name in the log messages.

Link: https://lore.kernel.org/r/20210412013127.2387-15-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:14 -04:00
James Smart
f115612528 scsi: lpfc: Standardize discovery object logging format
Code inspection showed lpfc was using three different pointer formats when
logging discovery object pointers.

Standardize the pointer format to x%px.

Note: %px use is limited to discovery objects in order to aid core
analysis.

Link: https://lore.kernel.org/r/20210412013127.2387-14-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:14 -04:00
James Smart
3bfab8a026 scsi: lpfc: Fix various trivial errors in comments and log messages
Clean up minor issues spotted by tools and code review:

 - Spelling Errors

 - Spurious characters and errors in function headers

 - nvme_info wqerr and err fields source data reversed

 - Extraneous new line in log message 0466

 - Spacing error in log message 0109

 - Messages 0140 and 0141 have portname and nodename reversed

 - Incorrect function labelling in comment

Link: https://lore.kernel.org/r/20210412013127.2387-13-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:14 -04:00
James Smart
b62232ba8c scsi: lpfc: Remove unsupported mbox PORT_CAPABILITIES logic
SLI-4 does not contain a PORT_CAPABILITIES mailbox command (only SLI-3
does, and SLI-3 doesn't use it), yet there are SLI-4 code paths that have
code to issue the command.  The command will always fail.

Remove the code for the mailbox command and leave only the resulting
"failure path" logic.

Link: https://lore.kernel.org/r/20210412013127.2387-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:14 -04:00
James Smart
d3de0d11a2 scsi: lpfc: Fix lpfc_hdw_queue attribute being ignored
The lpfc_hdw_queue attribute is to set the number of hardware queues to be
created on the adapter. Normally, the value is set to a default, which
allows the hw queue count to be sized dynamically based on adapter
capabilities, CPU/platform architecture, or CPU type. Currently, when
lpfc_hdw_queue is set to a specific value, is has no effect and the dynamic
sizing occurs.

The routine checking whether parameters are default or not ignores the
lpfc_hdw_queue setting and invokes the dynamic logic.

Fix the routine to additionally check the lpfc_hdw_queue attribute value
before using dynamic scaling. Additionally, SLI-3 supports only a small
number of queues with dedicated functions, thus it needs to be exempted
from the variable scaling and set to the expected values.

Link: https://lore.kernel.org/r/20210412013127.2387-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:14 -04:00
James Smart
a314dec37c scsi: lpfc: Fix missing FDMI registrations after Mgmt Svc login
FDMI registration needs to be performed after every login with the FC Mgmt
service. The flag the driver is using to track registration is cleared on
link up, but never on Mgmt service logout/re-login.

Fix by clearing the flag whenever a new login is completed with the FC Mgmt
service.

While perusing the flag use, logging was performed as if FDMI registration
occurred on vports. However, it is limited to the physical port only.
Revise the logging to reflect physical port based.

Link: https://lore.kernel.org/r/20210412013127.2387-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:14 -04:00
James Smart
a1a553e31a scsi: lpfc: Fix silent memory allocation failure in lpfc_sli4_bsg_link_diag_test()
In the unlikely case of a failure to allocate an LPFC_MBOXQ_t structure, no
return status is set, thus the routine never logs an error and returns
success to the callee.

Fix by setting a return code on failure.

Link: https://lore.kernel.org/r/20210412013127.2387-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:13 -04:00
James Smart
724f6b43a3 scsi: lpfc: Fix use-after-free on unused nodes after port swap
During target port swap, the swap logic ignores the DROPPED flag in the
nodes. As a node then moves into the UNUSED state, the reference count will
be dropped. If a node is later reused and moved out of the UNUSED state, an
access can result in a use-after-free assert.

Fix by having the port swap logic propagate the DROPPED flag when switching
nodes. This will avoid reference from being dropped.

Link: https://lore.kernel.org/r/20210412013127.2387-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:13 -04:00
James Smart
304ee43238 scsi: lpfc: Fix error handling for mailboxes completed in MBX_POLL mode
In SLI-4, when performing a mailbox command with MBX_POLL, the driver uses
the BMBX register to send the command rather than the MQ. A flag is set
indicating the BMBX register is active and saves the mailbox job struct
(mboxq) in the mbox_active element of the adapter. The routine then waits
for completion or timeout. The mailbox job struct is not freed by the
routine. In cases of timeout, the adapter will be reset. The
lpfc_sli_mbox_sys_flush() routine will clean up the mbox in preparation for
the reset. It clears the BMBX active flag and marks the job structure as
MBX_NOT_FINISHED. But, it never frees the mboxq job structure. Expectation
in both normal completion and timeout cases is that the issuer of the mbx
command will free the structure.  Unfortunately, not all calling paths are
freeing the memory in cases of error.

All calling paths were looked at and updated, if missing, to free the mboxq
memory regardless of completion status.

Link: https://lore.kernel.org/r/20210412013127.2387-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:13 -04:00
James Smart
4e76d4a9a2 scsi: lpfc: Fix lack of device removal on port swaps with PRLIs
During target port-swap testing with link flips, the initiator could
encounter PRLI errors.  If the target node disappears permanently, the ndlp
is found stuck in UNUSED state with ref count of 1. The rmmod of the driver
will hang waiting for this node to be freed.

While handling a link error in PRLI completion path, the code intends to
skip triggering the discovery state machine. However this is causing the
final reference release path to be skipped. This causes the node to be
stuck with ref count of 1

Fix by ensuring the code path triggers the device removal event on the node
state machine.

Link: https://lore.kernel.org/r/20210412013127.2387-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:13 -04:00
James Smart
a789241e49 scsi: lpfc: Fix NMI crash during rmmod due to circular hbalock dependency
Remove hbalock dependency for lpfc_abts_els_sgl_list and
lpfc_abts_nvmet_ctx_list.  The lists are adaquately synchronized with the
sgl_list_lock and abts_nvmet_buf_list_lock.

Link: https://lore.kernel.org/r/20210412013127.2387-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:13 -04:00
James Smart
f866eb06c0 scsi: lpfc: Fix reference counting errors in lpfc_cmpl_els_rsp()
Call traces are being seen that result from a nodelist structure ref
counting error. They are typically seen after transmission of an LS_RJT ELS
response.

Aged code in lpfc_cmpl_els_rsp() calls lpfc_nlp_not_used() which, if the
ndlp reference count is exactly 1, will decrement the reference count.
Previously lpfc_nlp_put() was within lpfc_els_free_iocb(), and the 'put'
within the free would only be invoked if cmdiocb->context1 was not NULL.
Since the nodelist structure reference count is decremented when exiting
lpfc_cmpl_els_rsp() the lpfc_nlp_not_used() calls are no longer required.
Calling them is causing the reference count issue.

Fix by removing the lpfc_nlp_not_used() calls.

Link: https://lore.kernel.org/r/20210412013127.2387-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:13 -04:00
James Smart
fffd18ec65 scsi: lpfc: Fix crash when a REG_RPI mailbox fails triggering a LOGO response
Fix a crash caused by a double put on the node when the driver completed an
ACC for an unsolicted abort on the same node.  The second put was executed
by lpfc_nlp_not_used() and is wrong because the completion routine executes
the nlp_put when the iocbq was released.  Additionally, the driver is
issuing a LOGO then immediately calls lpfc_nlp_set_state to put the node
into NPR.  This call does nothing.

Remove the lpfc_nlp_not_used call and additional set_state in the
completion routine.  Remove the lpfc_nlp_set_state post issue_logo.  Isn't
necessary.

Link: https://lore.kernel.org/r/20210412013127.2387-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 01:39:13 -04:00