Commit Graph

709 Commits

Author SHA1 Message Date
Qi Liu
20c634932a scsi: hisi_sas: Prevent parallel controller reset and control phy command
A user may issue a control phy command from sysfs at any time, even if the
controller is resetting.

If a phy is disabled by hardreset/linkreset command before calling
get_phys_state() in the reset path, the saved phy state may be incorrect.

To avoid incorrectly recording the phy state, use hisi_hba.sem to ensure
that the controller reset may not run at the same time as when the phy
control function is running.

Link: https://lore.kernel.org/r/1639579061-179473-6-git-send-email-john.garry@huawei.com
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-16 22:59:57 -05:00
John Garry
dc313f6b12 scsi: hisi_sas: Factor out task prep and delivery code
The task prep code is the same between the normal path (in
hisi_sas_task_prep()) and the internal abort path, so factor is out into a
common function.

Link: https://lore.kernel.org/r/1639579061-179473-5-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-16 22:59:57 -05:00
John Garry
08c61b5d90 scsi: hisi_sas: Pass abort structure for internal abort
To help factor out code in future, it's useful to know if we're executing
an internal abort, so pass a pointer to the structure. The idea is that a
NULL pointer means not an internal abort.

Link: https://lore.kernel.org/r/1639579061-179473-4-git-send-email-john.garry@huawei.com
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-16 22:59:57 -05:00
John Garry
934385a4fd scsi: hisi_sas: Make internal abort have no task proto
For an internal abort, the task does not have a protocol, so set to none.

This will make it easier to differentiate internal abort tasks in future.

Link: https://lore.kernel.org/r/1639579061-179473-3-git-send-email-john.garry@huawei.com
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-16 22:59:57 -05:00
John Garry
0e4620856b scsi: hisi_sas: Start delivery hisi_sas_task_exec() directly
Currently we start delivery of commands to the DQ after returning from
hisi_sas_task_exec() with success.

Let's just start delivery directly in that function without having to check
if some local variable is set.

Link: https://lore.kernel.org/r/1639579061-179473-2-git-send-email-john.garry@huawei.com
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-16 22:59:57 -05:00
Christophe JAILLET
4d6942e266 scsi: hisi_sas: Use non-atomic bitmap functions when possible
All uses of the 'hisi_hba->slot_index_tags' bitmap are protected with the
'hisi_hba->lock' spinlock.

Prefer the non-atomic '__[set|clear]_bit()' functions to save a few cycles.

Link: https://lore.kernel.org/r/8ee33e463523db080e6a2c06f332e47abb69359b.1637961191.git.christophe.jaillet@wanadoo.fr
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-06 22:12:34 -05:00
Christophe JAILLET
d43efddf62 scsi: hisi_sas: Remove some useless code in hisi_sas_alloc()
The 'hisi_hba->slot_index_tags' bitmap is allocated with bitmap_zalloc() so
it is already cleared. There is no need to clear it another time, one bit
at a time.

Remove the corresponding useless code.

Link: https://lore.kernel.org/r/41c86e7e3e05a13bd586d8ee1b81296140b7a6eb.1637961191.git.christophe.jaillet@wanadoo.fr
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-06 22:12:33 -05:00
Christophe JAILLET
54585ec62f scsi: hisi_sas: Use devm_bitmap_zalloc() when applicable
'hisi_hba->slot_index_tags' is a bitmap. Use 'devm_bitmap_zalloc()' to
simplify code, improve the semantic, and avoid some open-coded arithmetic
in allocator arguments.

Link: https://lore.kernel.org/r/4afa3f71e66c941c660627c7f5b0223b51968ebb.1637961191.git.christophe.jaillet@wanadoo.fr
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-12-06 22:12:33 -05:00
Bart Van Assche
db33028647 scsi: Remove superfluous #include <linux/async.h> directives
Remove this include directive from code that does not use any functionality
from kernel/async.c.

Link: https://lore.kernel.org/r/20211129194609.3466071-13-bvanassche@acm.org
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-11-29 23:02:15 -05:00
Linus Torvalds
fe91c4725a SCSI misc on 20211105
This series consists of the usual driver updates (ufs, smartpqi, lpfc,
 target, megaraid_sas, hisi_sas, qla2xxx) and minor updates and bug
 fixes.  Notable core changes are the removal of scsi->tag which caused
 some churn in obsolete drivers and a sweep through all drivers to call
 scsi_done() directly instead of scsi->done() which removes a pointer
 indirection from the hot path and a move to register core sysfs files
 earlier, which means they're available to KOBJ_ADD processing, which
 necessitates switching all drivers to using attribute groups.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYYUfBCYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbUJAQDZt4oc
 vUx9JpyrdHxxTCuOzVFd8W1oJn0k5ltCBuz4yAD8DNbGhGm93raMSJ3FOOlzLEbP
 RG8vBdpxMudlvxAPi/A=
 =BSFz
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This consists of the usual driver updates (ufs, smartpqi, lpfc,
  target, megaraid_sas, hisi_sas, qla2xxx) and minor updates and bug
  fixes.

  Notable core changes are the removal of scsi->tag which caused some
  churn in obsolete drivers and a sweep through all drivers to call
  scsi_done() directly instead of scsi->done() which removes a pointer
  indirection from the hot path and a move to register core sysfs files
  earlier, which means they're available to KOBJ_ADD processing, which
  necessitates switching all drivers to using attribute groups"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits)
  scsi: lpfc: Update lpfc version to 14.0.0.3
  scsi: lpfc: Allow fabric node recovery if recovery is in progress before devloss
  scsi: lpfc: Fix link down processing to address NULL pointer dereference
  scsi: lpfc: Allow PLOGI retry if previous PLOGI was aborted
  scsi: lpfc: Fix use-after-free in lpfc_unreg_rpi() routine
  scsi: lpfc: Correct sysfs reporting of loop support after SFP status change
  scsi: lpfc: Wait for successful restart of SLI3 adapter during host sg_reset
  scsi: lpfc: Revert LOG_TRACE_EVENT back to LOG_INIT prior to driver_resource_setup()
  scsi: ufs: ufshcd-pltfrm: Fix memory leak due to probe defer
  scsi: ufs: mediatek: Avoid sched_clock() misuse
  scsi: mpt3sas: Make mpt3sas_dev_attrs static
  scsi: scsi_transport_sas: Add 22.5 Gbps link rate definitions
  scsi: target: core: Stop using bdevname()
  scsi: aha1542: Use memcpy_{from,to}_bvec()
  scsi: sr: Add error handling support for add_disk()
  scsi: sd: Add error handling support for add_disk()
  scsi: target: Perform ALUA group changes in one step
  scsi: target: Replace lun_tg_pt_gp_lock with rcu in I/O path
  scsi: target: Fix alua_tg_pt_gps_count tracking
  scsi: target: Fix ordered tag handling
  ...
2021-11-05 08:42:02 -07:00
Christoph Hellwig
3ab0bc78e9 block: drop unused includes in <linux/blkdev.h>
Drop various include not actually used in blkdev.h itself.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210920123328.1399408-14-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 06:17:01 -06:00
Bart Van Assche
62ac8ccbb8 scsi: hisi_sas: Switch to attribute groups
struct device supports attribute groups directly but does not support
struct device_attribute directly. Hence switch to attribute groups.

Link: https://lore.kernel.org/r/20211012233558.4066756-21-bvanassche@acm.org
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-16 21:45:55 -04:00
Luo Jiaxing
21c7e97247 scsi: hisi_sas: Disable SATA disk phy for severe I_T nexus reset failure
If the softreset fails in the I_T reset, libsas will then continue to issue
a controller reset to try to recover.

However a faulty disk may cause the softreset to fail, and resetting the
controller will not help this scenario. Indeed, we will just continue the
cycle of error handle handling to try to recover.

So if the softreset fails upon certain conditions, just disable the phy
associated with the disk. The user needs to handle this problem.

Link: https://lore.kernel.org/r/1634041588-74824-5-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-12 22:46:07 -04:00
Xiang Chen
046ab7d0f5 scsi: hisi_sas: Wait for phyup in hisi_sas_control_phy()
When issuing a hardreset/linkreset/phy_set_linkrate from sysfs, the phy
will be disabled and re-enabled for the directly attached scenario.

It takes some time for the phy to come back up after re-enabling the phy.
If the controller becomes suspended while waiting for the phy to come back,
the phy up may be lost (along with the disk).

To solve this problem, wait for the phy up to occur with a timeout. Indeed
this is already done in hisi_sas_debug_I_T_nexus_reset() for local phys, so
just relocate the functionality to hisi_sas_control_phy().

Since the HA workqueue is drained when suspending the controller, and the
phy control function is called from the same workqueue, we can guarantee
that the controller will not be suspended during this period.

Link: https://lore.kernel.org/r/1634041588-74824-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-12 22:46:06 -04:00
Xiang Chen
36c6b7613e scsi: hisi_sas: Initialise devices in .slave_alloc callback
Perform driver-specific SCSI device initialization in the designated SCSI
midlayer callback instead of relying on the libsas "device found" callback.

The SCSI midlayer .slave_alloc interface is called prior to sending any I/O
to the device.

Link: https://lore.kernel.org/r/1634041588-74824-2-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-12 22:46:06 -04:00
Luo Jiaxing
9aec5ffa6e scsi: hisi_sas: Increase debugfs_dump_index after dump is completed
The hisi_hba debugfs_dump_index member should increased after a dump
insertion completed, and not before it has started, so fix the code to do
so.

Link: https://lore.kernel.org/r/1629799260-120116-6-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13 23:56:02 -04:00
Xiang Chen
080b4f976b scsi: hisi_sas: Replace del_timer() calls with del_timer_sync()
Some usage of del_timer() in the driver is potentially unsafe.

When running the sas_task->slow_task timer in
hisi_sas_exec_internal_tmf_task(), execution may be blocked in function
hisi_sas_task_exec(); so it is possible that the timer is running when the
callback to disable the timer is running. This could be dangerous, as we
immediately release resources which the timer callback uses after disabling
the timer. The same situation may be found at other sites, such as
_hisi_sas_internal_task_abort().

Change calls to del_timer() to del_timer_sync() as necessary, to ensure any
timer has finished when disabling.

Also remove calls to timer_pending() prior to del_timer() as it is not
necessary.

Link: https://lore.kernel.org/r/1629799260-120116-5-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13 23:56:02 -04:00
Luo Jiaxing
b5a9fa20e3 scsi: hisi_sas: Rename HISI_SAS_{RESET -> RESETTING}_BIT
HISI_SAS_RESET_BIT means that the controller is being reset, and so the
name is a bit vague. Rename it to HISI_SAS_RESETTING_BIT.

Link: https://lore.kernel.org/r/1629799260-120116-4-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13 23:56:02 -04:00
John Garry
089226ef6a scsi: hisi_sas: Stop printing queue count in v3 hardware probe
The number of hardware queues is available from sysfs. Remove the print in
the v3 hardware probe function.

Link: https://lore.kernel.org/r/1629799260-120116-3-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13 23:56:02 -04:00
Xiang Chen
4f6094f166 scsi: hisi_sas: Use managed PCI functions
Use managed PCI functions such as pcim_enable_device() and
pcim_iomap_regions() to simplify exception handling code.

Link: https://lore.kernel.org/r/1629799260-120116-2-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13 23:56:02 -04:00
Bart Van Assche
1effbface9 scsi: hisi_sas: Use scsi_cmd_to_rq() instead of scsi_cmnd.request
Prepare for removal of the request pointer by using scsi_cmd_to_rq()
instead. This patch does not change any functionality.

Link: https://lore.kernel.org/r/20210809230355.8186-22-bvanassche@acm.org
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-08-11 22:25:39 -04:00
Linus Torvalds
8b9cc17a46 SCSI misc on 20210711
This is a set of minor fixes and clean ups in the core and various
 drivers.  The only core change in behaviour is the I/O retry for
 spinup notify, but that shouldn't impact anything other than the
 failing case.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYOqWWyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishYBtAQCpqVdl
 Axi1SpD6/UuKOgRmboWscoKD8FLHwvLDMRyCRQEAnLu3XdB9HcQrwZOkTG14vrfB
 q2XB5cP4XAITxFLN1qo=
 =9AO9
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
 "This is a set of minor fixes and clean ups in the core and various
  drivers.

  The only core change in behaviour is the I/O retry for spinup notify,
  but that shouldn't impact anything other than the failing case"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (23 commits)
  scsi: virtio_scsi: Add validation for residual bytes from response
  scsi: ipr: System crashes when seeing type 20 error
  scsi: core: Retry I/O for Notify (Enable Spinup) Required error
  scsi: mpi3mr: Fix warnings reported by smatch
  scsi: qedf: Add check to synchronize abort and flush
  scsi: MAINTAINERS: Add mpi3mr driver maintainers
  scsi: libfc: Fix array index out of bound exception
  scsi: mvsas: Use DEVICE_ATTR_RO()/RW() macro
  scsi: megaraid_mbox: Use DEVICE_ATTR_ADMIN_RO() macro
  scsi: qedf: Use DEVICE_ATTR_RO() macro
  scsi: qedi: Use DEVICE_ATTR_RO() macro
  scsi: message: mptfc: Switch from pci_ to dma_ API
  scsi: be2iscsi: Fix some missing space in some messages
  scsi: be2iscsi: Fix an error handling path in beiscsi_dev_probe()
  scsi: ufs: Fix build warning without CONFIG_PM
  scsi: bnx2fc: Remove meaningless bnx2fc_abts_cleanup() return value assignment
  scsi: qla2xxx: Add heartbeat check
  scsi: virtio_scsi: Do not overwrite SCSI status
  scsi: libsas: Add LUN number check in .slave_alloc callback
  scsi: core: Inline scsi_mq_alloc_queue()
  ...
2021-07-11 10:59:53 -07:00
Linus Torvalds
bd31b9efbf SCSI misc on 20210702
This series consists of the usual driver updates (ufs, ibmvfc,
 megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with
 elx and mpi3mr being new drivers.  The major core change is a rework
 to drop the status byte handling macros and the old bit shifted
 definitions and the rest of the updates are minor fixes.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYN7I6iYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXpRAQCkngYZ
 35yQrqOxgOk2pfrysE95tHrV1MfJm2U49NFTwAEAuZutEvBUTfBF+sbcJ06r6q7i
 H0hkJN/Io7enFs5v3WA=
 =zwIa
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This series consists of the usual driver updates (ufs, ibmvfc,
  megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with
  elx and mpi3mr being new drivers.

  The major core change is a rework to drop the status byte handling
  macros and the old bit shifted definitions and the rest of the updates
  are minor fixes"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (287 commits)
  scsi: aha1740: Avoid over-read of sense buffer
  scsi: arcmsr: Avoid over-read of sense buffer
  scsi: ips: Avoid over-read of sense buffer
  scsi: ufs: ufs-mediatek: Add missing of_node_put() in ufs_mtk_probe()
  scsi: elx: libefc: Fix IRQ restore in efc_domain_dispatch_frame()
  scsi: elx: libefc: Fix less than zero comparison of a unsigned int
  scsi: elx: efct: Fix pointer error checking in debugfs init
  scsi: elx: efct: Fix is_originator return code type
  scsi: elx: efct: Fix link error for _bad_cmpxchg
  scsi: elx: efct: Eliminate unnecessary boolean check in efct_hw_command_cancel()
  scsi: elx: efct: Do not use id uninitialized in efct_lio_setup_session()
  scsi: elx: efct: Fix error handling in efct_hw_init()
  scsi: elx: efct: Remove redundant initialization of variable lun
  scsi: elx: efct: Fix spelling mistake "Unexected" -> "Unexpected"
  scsi: lpfc: Fix build error in lpfc_scsi.c
  scsi: target: iscsi: Remove redundant continue statement
  scsi: qla4xxx: Remove redundant continue statement
  scsi: ppa: Switch to use module_parport_driver()
  scsi: imm: Switch to use module_parport_driver()
  scsi: mpt3sas: Fix error return value in _scsih_expander_add()
  ...
2021-07-02 15:14:36 -07:00
Yufen Yu
49da96d779 scsi: libsas: Add LUN number check in .slave_alloc callback
Offlining a SATA device connected to a hisi SAS controller and then
scanning the host will result in detecting 255 non-existent devices:

  # lsscsi
  [2:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q  /dev/sda
  [2:0:1:0]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdb
  [2:0:2:0]    disk    SEAGATE  ST600MM0006      B001  /dev/sdc
  # echo "offline" > /sys/block/sdb/device/state
  # echo "- - -" > /sys/class/scsi_host/host2/scan
  # lsscsi
  [2:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q  /dev/sda
  [2:0:1:0]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdb
  [2:0:1:1]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdh
  ...
  [2:0:1:255]  disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdjb

After a REPORT LUN command issued to the offline device fails, the SCSI
midlayer tries to do a sequential scan of all devices whose LUN number is
not 0. However, SATA does not support LUN numbers at all.

Introduce a generic sas_slave_alloc() handler which will return -ENXIO for
SATA devices if the requested LUN number is larger than 0 and make libsas
drivers use this function as their .slave_alloc callback.

Link: https://lore.kernel.org/r/20210622034037.1467088-1-yuyufen@huawei.com
Reported-by: Wu Bo <wubo40@huawei.com>
Suggested-by: John Garry <john.garry@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-22 21:33:33 -04:00
Luo Jiaxing
e8a4d0daae scsi: hisi_sas: Speed up error handling when internal abort timeout occurs
If an internal task abort timeout occurs, the controller has developed a
fault, and needs to be reset to be recovered.

When this occurs during error handling, the current policy is to allow
error handling to continue, and the inevitable nexus ha reset will handle
the required reset.

However various steps of error handling need to taken before this happens.
These also involve some level of HW interaction, which will also fail with
various timeouts.

Speed up this process by recording a HW fault bit for an internal abort
timeout - when this is set, just automatically error any HW interaction,
and essentially go straight to clear nexus ha (to reset the controller).

Link: https://lore.kernel.org/r/1623058179-80434-6-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-09 23:21:52 -04:00
Luo Jiaxing
63ece9eb35 scsi: hisi_sas: Reset controller for internal abort timeout
If an internal task abort timeout occurs, the controller has developed a
fault, and needs to be reset to be recovered. However if a timeout occurs
during SCSI error handling, issuing a controller reset immediately may
conflict with the error handling.

To handle internal abort in these two scenarios, only queue the reset when
not in an error handling function. In the case of a timeout during error
handling, do nothing and rely on the inevitable ha nexus reset to reset the
controller.

Link: https://lore.kernel.org/r/1623058179-80434-5-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-09 23:21:51 -04:00
Luo Jiaxing
2f12a49951 scsi: hisi_sas: Include HZ in timer macros
Include HZ in timer macros to make the code more concise.

Link: https://lore.kernel.org/r/1623058179-80434-4-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-09 23:21:51 -04:00
Luo Jiaxing
0f75733991 scsi: hisi_sas: Run I_T nexus resets in parallel for clear nexus reset
For a clear nexus reset operation, the I_T nexus resets are executed
serially for each device. For devices attached through an expander, this
may take 2s per device; so, in total, could take a long time.

Reduce the total time by running the I_T nexus resets in parallel through
async operations.

Link: https://lore.kernel.org/r/1623058179-80434-3-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-09 23:21:51 -04:00
Luo Jiaxing
366da0da1f scsi: hisi_sas: Put a limit of link reset retries
If an OOB event is received but the phy still fails to come up, a link
reset will be issued repeatedly at an interval of 20s until the phy comes
up.

Set a limit for link reset issue retries to avoid printing the timeout
message endlessly.

Link: https://lore.kernel.org/r/1623058179-80434-2-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-09 23:21:51 -04:00
Bart Van Assche
d377f415dd scsi: libsas: Introduce more SAM status code aliases in enum exec_status
This patch prepares for converting SAM status codes into an enum. Without
this patch converting SAM status codes into an enumeration type would
trigger complaints about enum type mismatches for the SAS code.

Link: https://lore.kernel.org/r/20210524025457.11299-2-bvanassche@acm.org
Cc: Hannes Reinecke <hare@suse.com>
Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Cc: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 16:10:46 -04:00
Yang Yingliang
7907a021e4 scsi: hisi_sas: Drop free_irq() of devm_request_irq() allocated irq
irqs allocated with devm_request_irq() should not be freed using
free_irq(). Doing so causes a dangling pointer and a subsequent double
free.

Link: https://lore.kernel.org/r/20210519130519.2661938-1-yangyingliang@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 22:46:55 -04:00
Sergey Shtylyov
ab17122e75 scsi: hisi_sas: Propagate errors in interrupt_init_v1_hw()
After commit 6c11dc0604 ("scsi: hisi_sas: Fix IRQ checks") we have the
error codes returned by platform_get_irq() ready for the propagation
upsream in interrupt_init_v1_hw() -- that will fix still broken deferred
probing. Let's propagate the error codes from devm_request_irq() as well
since I don't see the reason to override them with -ENOENT...

Link: https://lore.kernel.org/r/49ba93a3-d427-7542-d85a-b74fe1a33a73@omp.ru
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-21 17:04:13 -04:00
Sergey Shtylyov
6c11dc0604 scsi: hisi_sas: Fix IRQ checks
Commit df2d8213d9 ("hisi_sas: use platform_get_irq()") failed to take
into account that irq_of_parse_and_map() and platform_get_irq() have a
different way of indicating an error: the former returns 0 and the latter
returns a negative error code. Fix up the IRQ checks!

Link: https://lore.kernel.org/r/810f26d3-908b-1d6b-dc5c-40019726baca@omprussia.ru
Fixes: df2d8213d9 ("hisi_sas: use platform_get_irq()")
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-13 00:16:42 -04:00
Luo Jiaxing
f4df167ad5 scsi: hisi_sas: Print SATA device SAS address for soft reset failure
Add (pseudo) SAS address for ATA software reset failure log to assist in
debugging.

Link: https://lore.kernel.org/r/1617709711-195853-7-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-12 23:21:26 -04:00
Luo Jiaxing
2d31cb20a3 scsi: hisi_sas: Warn in v3 hw channel interrupt handler when status reg cleared
If a channel interrupt occurs without any status bit set, the handler will
return directly. However, if such redundant interrupts are received, it's
better to check what happen, so add logs for this.

Link: https://lore.kernel.org/r/1617709711-195853-6-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: Yihang Li <liyihang6@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-12 23:21:26 -04:00
Jianqin Xie
2c74cb1f92 scsi: hisi_sas: Directly snapshot registers when executing a reset
The debugfs snapshot should be executed before the reset occurs to ensure
that the register contents are saved properly.

As such, it is incorrect to queue the debugfs dump when running a reset as
the reset will occur prior to the snapshot work item is handler.

Therefore, directly snapshot registers in the reset work handler.

Link: https://lore.kernel.org/r/1617709711-195853-5-git-send-email-john.garry@huawei.com
Signed-off-by: Jianqin Xie <xiejianqin@hisilicon.com>
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-12 23:21:26 -04:00
Xiang Chen
f467666504 scsi: hisi_sas: Call sas_unregister_ha() to roll back if .hw_init() fails
Function sas_unregister_ha() needs to be called to roll back if
hisi_hba->hw->hw_init() fails in function hisi_sas_probe() or
hisi_sas_v3_probe(). Make that change.

Link: https://lore.kernel.org/r/1617709711-195853-4-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-12 23:21:26 -04:00
Luo Jiaxing
4da0b7f6fa scsi: hisi_sas: Print SAS address for v3 hw erroneous completion print
To help debugging efforts, print the device SAS address for v3 hw erroneous
completion log.

Here is an example print:

hisi_sas_v3_hw 0000:b4:02.0: erroneous completion iptt=2193 task=000000002b0c13f8 dev id=17 addr=570fd45f9d17b001

Link: https://lore.kernel.org/r/1617709711-195853-3-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-12 23:21:26 -04:00
Luo Jiaxing
2843d2fb42 scsi: hisi_sas: Delete some unused callbacks
The debugfs code has been relocated to v3 hw driver, so delete unused
struct hisi_sas_hw function pointers snapshot_{prepare, restore}.

Link: https://lore.kernel.org/r/1617709711-195853-2-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-12 23:21:26 -04:00
Luo Jiaxing
cd96fe600c scsi: hisi_sas: Add trace FIFO debugfs support
The controller provides trace FIFO DFX tool to assist link fault debugging
and link optimization. This tool can be helpful when debugging link faults
without SAS analyzers. Each PHY has an independent trace FIFO interface.

The user can configure the trace FIFO tool of one PHY by using the
following six interfaces:

signal_sel: select signal group applies to different scenarios.
  0x0: linkrate negotiation
  0x1: Host 12G TX train
  0x2: Disk 12G TX train
  0x3: SAS PHY CTRL DFX 0
  0x4: SAS PHY CTRL DFX 1
  0x5: SAS PCS DFX
  other: linkrate negotiation
dump_mask: The masked hardware status bit will not be updated.
dump_mode: determines how to dump data after trigger signal is generated.
  0x0: dump forever
  0x1: dump 32 data after trigger signal is generated
  0x2: no more dump after trigger signal is generated
trigger_mode: determines the trigger mode, level or edge.
  0x0: dump when trigger signal changed
  0x1: dump when trigger signal's level equal to trigger_level
  0x2: dump when trigger signal's level different from trigger_level
trigger_level: determines the trigger level.
trigger_msk: mask trigger signal

The user can get 32-byte values from hardware by reading the rd_data.
These values consitute the status record of the hardware at different time
points.

Link: https://lore.kernel.org/r/1611659068-131975-6-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-26 23:02:11 -05:00
Luo Jiaxing
6834ec8b23 scsi: hisi_sas: Flush workqueue in hisi_sas_v3_remove()
If the controller reset occurs at the same time as driver removal, it may
be possible that the interrupts have been released prior to the host
softreset, and calling pci_irq_vector() there causes a WARN:

WARNING: CPU: 37 PID: 1542 /pci/msi.c:1275 pci_irq_vector+0xc0/0xd0
Call trace:
pci_irq_vector+0xc0/0xd0
disable_host_v3_hw+0x58/0x5b0 [hisi_sas_v3_hw]
soft_reset_v3_hw+0x40/0xc0 [hisi_sas_v3_hw]
hisi_sas_controller_reset+0x150/0x260 [hisi_sas_main]
hisi_sas_rst_work_handler+0x3c/0x58 [hisi_sas_main]

To fix, flush the driver workqueue prior to releasing the interrupts to
ensure any resets have been completed.

Link: https://lore.kernel.org/r/1611659068-131975-5-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-26 23:02:11 -05:00
Luo Jiaxing
1dbe61bf7d scsi: hisi_sas: Enable debugfs support by default
Add a config option to enable debugfs support by default. And if debugfs
support is enabled by default, dump count default value is increased to 50
as generally users want something bigger than the current default in that
situation.

Link: https://lore.kernel.org/r/1611659068-131975-4-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-26 23:02:11 -05:00
John Garry
69bfa5fd7b scsi: hisi_sas: Don't check .nr_hw_queues in hisi_sas_task_prep()
Now that v2 and v3 hw expose their HW queues (and so shost.nr_hw_queues is
set), remove the conditional checks in hisi_sas_task_prep().

This change would affect v1 HW performance (as it does not expose HW
queues), but nobody uses it and support may be dropped soon.

Link: https://lore.kernel.org/r/1611659068-131975-3-git-send-email-john.garry@huawei.com
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-26 23:02:11 -05:00
John Garry
4d287d8bae scsi: hisi_sas: Remove deferred probe check in hisi_sas_v2_probe()
The platform_get_irq() check for -EPROBE_DEFER was to ensure that all the
steps to add the SCSI host are not done and then only to realise that the
probe needs to be deferred.

However, since there is now an earlier check for this in
hisi_sas_interrupt_preinit(), this check is superfluous and may be removed.

Link: https://lore.kernel.org/r/1611659068-131975-2-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-26 23:02:11 -05:00
Ahmed S. Darwish
872a90b5b4 scsi: hisi_sas: Switch back to original libsas event notifiers
libsas event notifiers required an extension where gfp_t flags must be
explicitly passed. For bisectability, a temporary _gfp() variant of such
functions were added. All call sites then got converted use the _gfp()
variants and explicitly pass GFP context. Having no callers left, the
original libsas notifiers were then modified to accept gfp_t flags by
default.

Switch back to the original libas API, while still passing GFP context.
The libsas _gfp() variants will be removed afterwards.

Link: https://lore.kernel.org/r/20210118100955.1761652-14-a.darwish@linutronix.de
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:09 -05:00
Ahmed S. Darwish
26c7efc3f9 scsi: hisi_sas: Pass gfp_t flags to libsas event notifiers
Use the new libsas event notifiers API, which requires callers to
explicitly pass the gfp_t memory allocation flags.

Below are the context analysis for modified functions:

=> hisi_sas_bytes_dmaed():

Since it is invoked from both process and atomic contexts, let its callers
pass the gfp_t flags:

  * hisi_sas_main.c:
  ------------------

    hisi_sas_phyup_work(): workqueue context
      -> hisi_sas_bytes_dmaed(..., GFP_KERNEL)

    hisi_sas_controller_reset_done(): has an msleep()
      -> hisi_sas_rescan_topology()
        -> hisi_sas_phy_down()
          -> hisi_sas_bytes_dmaed(..., GFP_KERNEL)

    hisi_sas_debug_I_T_nexus_reset(): calls wait_for_completion_timeout()
      -> hisi_sas_phy_down()
        -> hisi_sas_bytes_dmaed(..., GFP_KERNEL)

  * hisi_sas_v1_hw.c:
  -------------------

    int_abnormal_v1_hw(): irq handler
      -> hisi_sas_phy_down()
        -> hisi_sas_bytes_dmaed(..., GFP_ATOMIC)

  * hisi_sas_v[23]_hw.c:
  ----------------------

    int_phy_updown_v[23]_hw(): irq handler
      -> phy_down_v[23]_hw()
        -> hisi_sas_phy_down()
          -> hisi_sas_bytes_dmaed(..., GFP_ATOMIC)

=> int_bcast_v1_hw() and phy_bcast_v3_hw():

Both are invoked exclusively from irq handlers. Pass GFP_ATOMIC.

Link: https://lore.kernel.org/r/20210118100955.1761652-12-a.darwish@linutronix.de
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:08 -05:00
John Garry
121181f3f8 scsi: libsas: Remove notifier indirection
LLDDs report events to libsas with .notify_port_event and .notify_phy_event
callbacks.

These callbacks are fixed and so there is no reason why the functions
cannot be called directly, so do that.

This neatens the code slightly, makes it more obvious, and reduces function
pointer usage, which is generally a good thing. Downside is that there are
2x more symbol exports.

[a.darwish@linutronix.de: Remove the now unused "sas_ha" local variables]

Link: https://lore.kernel.org/r/20210118100955.1761652-3-a.darwish@linutronix.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 21:31:07 -05:00
Martin K. Petersen
938a2fbefb Merge branch '5.11/scsi-fixes' into 5.12/scsi-queue
Pull in the 5.11 SCSI fixes branch to provide an updated baseline for
megaraid and hisi_sas. Both drivers received core changes in
v5.11-rc3.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-20 18:26:06 -05:00
John Garry
3997e0fdd5 scsi: hisi_sas: Remove auto_affine_msi_experimental module_param
Now that the driver always uses managed interrupts, delete
auto_affine_msi_experimental module param.

Link: https://lore.kernel.org/r/1609763622-34119-2-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 22:52:46 -05:00
Martin K. Petersen
a8f808839a Merge branch '5.11/scsi-postmerge' into 5.11/scsi-fixes
Merge two commits that had dependencies on other 5.11 trees (the block
and the irq trees respectively).

 - We reverted a megaraid_sas change in 5.10 due to missing block
   layer plumbing. Now that this is in place, reinstate the change.

 - The hisi_sas driver had a dependency on a driver core irq change
   that went in through Thomas' tree.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-04 13:27:39 -05:00
John Garry
74a2921948 scsi: hisi_sas: Expose HW queues for v2 hw
As a performance enhancement, make the completion queue interrupts managed.

In addition, in commit bf0beec060 ("blk-mq: drain I/O when all CPUs in a
hctx are offline"), CPU hotplug for MQ devices using managed interrupts is
made safe. So expose HW queues to blk-mq to take advantage of this.

Flag Scsi_host.host_tagset is also set to ensure that the HBA is not sent
more commands than it can handle. However the driver still does not use
request tag for IPTT as there are many HW bugs means that special rules
apply for IPTT allocation.

Link: https://lore.kernel.org/r/1606905417-183214-6-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-21 22:21:05 -05:00
Linus Torvalds
60f7c503d9 SCSI misc on 20201216
This series consists of the usual driver updates (ufs, qla2xxx,
 smartpqi, target, zfcp, fnic, mpt3sas, ibmvfc) plus a load of
 cleanups, a major power management rework and a load of assorted minor
 updates.  There are a few core updates (formatting fixes being the big
 one) but nothing major this cycle.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCX9o0KSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbOZAP9D5NTN
 J7dJUo2MIMy84YBu+d9ag7yLlNiRWVY2yw5vHwD/Z7JjAVLwz/tzmyjU9//o2J6w
 hwhOv6Uto89gLCWSEz8=
 =KUPT
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This consists of the usual driver updates (ufs, qla2xxx, smartpqi,
  target, zfcp, fnic, mpt3sas, ibmvfc) plus a load of cleanups, a major
  power management rework and a load of assorted minor updates.

  There are a few core updates (formatting fixes being the big one) but
  nothing major this cycle"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits)
  scsi: mpt3sas: Update driver version to 36.100.00.00
  scsi: mpt3sas: Handle trigger page after firmware update
  scsi: mpt3sas: Add persistent MPI trigger page
  scsi: mpt3sas: Add persistent SCSI sense trigger page
  scsi: mpt3sas: Add persistent Event trigger page
  scsi: mpt3sas: Add persistent Master trigger page
  scsi: mpt3sas: Add persistent trigger pages support
  scsi: mpt3sas: Sync time periodically between driver and firmware
  scsi: qla2xxx: Update version to 10.02.00.104-k
  scsi: qla2xxx: Fix device loss on 4G and older HBAs
  scsi: qla2xxx: If fcport is undergoing deletion complete I/O with retry
  scsi: qla2xxx: Fix the call trace for flush workqueue
  scsi: qla2xxx: Fix flash update in 28XX adapters on big endian machines
  scsi: qla2xxx: Handle aborts correctly for port undergoing deletion
  scsi: qla2xxx: Fix N2N and NVMe connect retry failure
  scsi: qla2xxx: Fix FW initialization error on big endian machines
  scsi: qla2xxx: Fix crash during driver load on big endian machines
  scsi: qla2xxx: Fix compilation issue in PPC systems
  scsi: qla2xxx: Don't check for fw_started while posting NVMe command
  scsi: qla2xxx: Tear down session if FW say it is down
  ...
2020-12-16 13:34:31 -08:00
Xiang Chen
359db63378 scsi: hisi_sas: Select a suitable queue for internal I/Os
For when managed interrupts are used (and shost->nr_hw_queues is set), a
fixed queue - set per-device - is still used for internal I/Os.

If all the CPUs mapped to that queue are offlined, then the completions for
that queue are not serviced and any internal I/Os will time out.

Fix by selecting a queue for internal I/Os from the queue mapped from the
current CPU in this scenario.

This is still not ideal as it does not deal with CPU hotplug for inflight
internal I/Os, and needs proper support from [0].

[0] https://lore.kernel.org/linux-scsi/20200703130122.111448-1-hare@suse.de/T/#m7d77d049b18f33a24ef206af69ebb66d07440556

Link: https://lore.kernel.org/r/1607347855-59091-1-git-send-email-john.garry@huawei.com
Fixes: 8d98416a55 ("scsi: hisi_sas: Switch v3 hw to MQ")
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-07 21:23:51 -05:00
Ahmed S. Darwish
18577cdcae scsi: hisi_sas: Remove preemptible()
hisi_sas_task_exec() uses preemptible() to see if it's safe to block.  This
does not work for CONFIG_PREEMPT_COUNT=n kernels in which preemptible()
always returns 0.

The problem is masked when enabling some of the common Kconfig.debug
options (like CONFIG_DEBUG_ATOMIC_SLEEP), as they implicitly enable the
preemption counter.

In general, driver leaf functions should not make logic decisions based on
the context they're called from. The caller should be the entity
responsible for explicitly indicating context.

Since hisi_sas_task_exec() already has a gfp_t flags parameter, use it as
the explicit context marker.

Link: https://lore.kernel.org/r/20201126132952.2287996-3-bigeasy@linutronix.de
Fixes: 214e702d4b ("scsi: hisi_sas: Adjust task reject period during host reset")
Fixes: 550c0d89d5 ("scsi: hisi_sas: Replace in_softirq() check in hisi_sas_task_exec()")
Cc: Xiaofei Tan <tanxiaofei@huawei.com>
Cc: Xiang Chen <chenxiang66@hisilicon.com>
Cc: John Garry <john.garry@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-01 00:03:52 -05:00
Luo Jiaxing
623a4b6d5c scsi: hisi_sas: Move debugfs code to v3 hw driver
Relocate all the debugfs code for DFX to v3 hw since no other versions
support it.

Link: https://lore.kernel.org/r/1606207594-196362-4-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-30 23:42:16 -05:00
Xiang Chen
2ebde94f2e scsi: hisi_sas: Fix up probe error handling for v3 hw
Fix some rollbacks in function hisi_sas_v3_probe() and
interrupt_init_v3_hw().

Link: https://lore.kernel.org/r/1606207594-196362-3-git-send-email-john.garry@huawei.com
Fixes: 8d98416a55 ("scsi: hisi_sas: Switch v3 hw to MQ")
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-30 23:41:31 -05:00
John Garry
bec99e5250 scsi: hisi_sas: Reduce some indirection in v3 hw driver
Sometimes local functions are called indirectly from the hw driver, which
only makes the code harder to follow. Remove these.

Method .hw_init is only called from platform driver probe, which is not
relevant, so don't set this either.

Link: https://lore.kernel.org/r/1606207594-196362-2-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-30 23:41:31 -05:00
Vaibhav Gupta
71c8f15e1d scsi: hisi_sas_v3_hw: Remove extra function calls for runtime pm
Both runtime_suspend_v3_hw() and runtime_resume_v3_hw() do nothing else but
invoke suspend_v3_hw() and resume_v3_hw() respectively. This is the case of
unnecessary function calls. To use those functions for runtime pm as well,
simply use UNIVERSAL_DEV_PM_OPS.

make -j$(nproc) W=1, with CONFIG_PM disabled, throws '-Wunused-function'
warning for runtime_suspend_v3_hw() and runtime_resume_v3_hw(). After
dropping those function definitions, the warning was thrown for
suspend_v3_hw() and resume_v3_hw(). Hence, mark them as '__maybe_unused'.

Link: https://lore.kernel.org/r/20201102164730.324035-15-vaibhavgupta40@gmail.com
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-25 23:14:31 -05:00
Vaibhav Gupta
027e508aea scsi: hisi_sas_v3_hw: Don't use PCI helper functions
Drivers using new-framework/generic-framework should not handle standard
power management operations. These operations were performed by legacy
framework through PCI helper functions like pci_save/restore_state(),
pci_set_power_state(), etc.

Drivers should not use them now.

Link: https://lore.kernel.org/r/20201102164730.324035-14-vaibhavgupta40@gmail.com
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-25 23:14:31 -05:00
Vaibhav Gupta
17b5e4d148 scsi: hisi_sas_v3_hw: Drop PCI Wakeup calls from .resume
The driver calls pci_enable_wake(...., false) in hisi_sas_v3_resume(), and
there is no corresponding pci_enable_wake(...., true) in
hisi_sas_v3_suspend(). Either it should do enable-wake the device in
.suspend() or should not invoke pci_enable_wake() at all.

Concluding that this driver doesn't support enable-wake and PCI core calls
pci_enable_wake(pci_dev, PCI_D0, false) during resume, drop it from
hisi_sas_v3_resume().

Link: https://lore.kernel.org/r/20201102164730.324035-13-vaibhavgupta40@gmail.com
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-25 23:14:31 -05:00
John Garry
fab09aaee8 scsi: hisi_sas: Stop using queue #0 always for v2 hw
In commit 8d98416a55 ("scsi: hisi_sas: Switch v3 hw to MQ"), the dispatch
function was changed to choose the delivery queue based on the request tag
HW queue index.

This heavily degrades performance for v2 hw, since the HW queues are not
exposed there, and, as such, HW queue #0 is used for every command.

Revert to previous behaviour for when nr_hw_queues is not set, that being
to choose the HW queue based on target device index.

Link: https://lore.kernel.org/r/1602750425-240341-1-git-send-email-john.garry@huawei.com
Fixes: 8d98416a55 ("scsi: hisi_sas: Switch v3 hw to MQ")
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-26 17:34:27 -04:00
Linus Torvalds
55e0500eb5 SCSI misc on 20201013
This series consists of the usual driver updates (ufs, qla2xxx, tcmu,
 ibmvfc, lpfc, smartpqi, hisi_sas, qedi, qedf, mpt3sas) and minor bug
 fixes.  There are only three core changes: adding sense codes,
 cleaning up noretry and adding an option for limitless retries.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCX4YulyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishaZDAQCT7rwG
 UEZYHgYkU9EX9ERVBQM0SW4mLrxf3g3P5ioJsAEAtkclCM4QsIOP+MIPjIa0EyUY
 khu0kcrmeFR2YwA8zhw=
 =4w4S
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "The usual driver updates (ufs, qla2xxx, tcmu, ibmvfc, lpfc, smartpqi,
  hisi_sas, qedi, qedf, mpt3sas) and minor bug fixes.

  There are only three core changes: adding sense codes, cleaning up
  noretry and adding an option for limitless retries"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (226 commits)
  scsi: hisi_sas: Recover PHY state according to the status before reset
  scsi: hisi_sas: Filter out new PHY up events during suspend
  scsi: hisi_sas: Add device link between SCSI devices and hisi_hba
  scsi: hisi_sas: Add check for methods _PS0 and _PR0
  scsi: hisi_sas: Add controller runtime PM support for v3 hw
  scsi: hisi_sas: Switch to new framework to support suspend and resume
  scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq()
  scsi: qedf: Remove redundant assignment to variable 'rc'
  scsi: lpfc: Remove unneeded variable 'status' in lpfc_fcp_cpu_map_store()
  scsi: snic: Convert to use DEFINE_SEQ_ATTRIBUTE macro
  scsi: qla4xxx: Delete unneeded variable 'status' in qla4xxx_process_ddb_changed
  scsi: sun_esp: Use module_platform_driver to simplify the code
  scsi: sun3x_esp: Use module_platform_driver to simplify the code
  scsi: sni_53c710: Use module_platform_driver to simplify the code
  scsi: qlogicpti: Use module_platform_driver to simplify the code
  scsi: mac_esp: Use module_platform_driver to simplify the code
  scsi: jazz_esp: Use module_platform_driver to simplify the code
  scsi: mvumi: Fix error return in mvumi_io_attach()
  scsi: lpfc: Drop nodelist reference on error in lpfc_gen_req()
  scsi: be2iscsi: Fix a theoretical leak in beiscsi_create_eqs()
  ...
2020-10-14 15:15:35 -07:00
Xiang Chen
69f4ec1edb scsi: hisi_sas: Recover PHY state according to the status before reset
Currently the PHY state is set according to the state of the PHYs after
reset. This is invalid as the PHYs are already re-initialized.

Set PHY state according to the state before the reset instead of after.

Link: https://lore.kernel.org/r/1601649038-25534-8-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-06 20:47:06 -04:00
Xiang Chen
b14a37e011 scsi: hisi_sas: Filter out new PHY up events during suspend
Currently sas_resume_ha() is called while resuming the controller to wait
for all suspended PHYs to come up and all the libsas events to be
completed.

There is a scenario which will cause task hung: For direct attach with two
disks connected with two PHYs, disable phy0 before suspending the disk on
phy1 and the controller, then enable phy0 and resume the controller, and
task hung occurs as follows:

[  591.901463] hisi_sas_v3_hw 0000:b4:02.0: resuming from operating state [D0]
[  593.113525] hisi_sas_v3_hw 0000:b4:02.0: neither _PS0 nor _PR0 is defined
[  593.120301] hisi_sas_v3_hw 0000:b4:02.0: waiting up to 25 seconds for 1 phy to resume
[  593.120836] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata)
[  593.134680] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy1 link_rate=10(sata)
[  593.134733] sas: phy-2:0 added to port-2:0, phy_mask:0x1 (5000000000000200)
[  593.148350] sas: DOING DISCOVERY on port 0, pid:948
[  593.153227] hisi_sas_v3_hw 0000:b4:02.0: dev[3:5] found
[  593.159840] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
[  593.165663] sas: ata7: end_device-2:0: dev error handler
[  593.165730] sas: ata2: end_device-2:1: dev error handler
[  593.172532] hisi_sas_v3_hw 0000:b4:02.0: phydown: phy0 phy_state=0x2
[  593.182570] hisi_sas_v3_hw 0000:b4:02.0: ignore flutter phy0 down
[  593.331277] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata)
[  593.498956] ata7.00: ATA-11: SAMSUNG MZ7LH960HAJR-00005, HXT7404Q, max UDMA/133
[  593.506235] ata7.00: 1875385008 sectors, multi 16: LBA48 NCQ (depth 32)
[  593.514295] ata7.00: configured for UDMA/133
[  593.518557] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
[  593.528613] sas: ata7: end_device-2:0: model:SAMSUNG MZ7LH960HAJR-00005
serial:S45NNA0M712225
[  593.537520] device_link_add 316: dev=2:0:2:0 supplier:2 consumer:0
[  593.543674] device_link_add 324
[  593.546801] device_link_add 352
[  593.549930] device_link_add 406
[  593.553058] device_link_add 440: dev=2:0:2:0 supplier:2 consumer:0
[  593.559208] device_link_add 444
[  593.562335] device_link_add 455
[  593.565517] scsi 2:0:2:0: Direct-Access     ATA      SAMSUNG MZ7LH960 404Q PQ: 0
ANSI: 5
[  620.057464]  phy-2:1: resume timeout
[  738.841445] INFO: task kworker/u256:0:8 blocked for more than 120 seconds.
[  738.848295]       Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744
[  738.854361] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  738.862155] kworker/u256:0  D    0     8      2 0x00000028
[  738.867626] Workqueue: 0000:b4:02.0_event_q sas_port_event_worker
[  738.873693] Call trace:
[  738.876133]  __switch_to+0xf4/0x148
[  738.879613]  __schedule+0x270/0x5d8
[  738.883091]  schedule+0x78/0x110
[  738.886307]  schedule_timeout+0x1ac/0x280
[  738.890299]  wait_for_completion+0x94/0x138
[  738.894472]  flush_workqueue+0x114/0x438
[  738.898377]  sas_porte_bytes_dmaed+0x400/0x500
[  738.902801]  sas_port_event_worker+0x28/0x40
[  738.907053]  process_one_work+0x1e8/0x360
[  738.911046]  worker_thread+0x44/0x478
[  738.914698]  kthread+0x150/0x158
[  738.917915]  ret_from_fork+0x10/0x1c
[  738.921534] INFO: task kworker/u256:1:948 blocked for more than 120 seconds.
[  738.928550]       Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744
[  738.934614] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  738.942408] kworker/u256:1  D    0   948      2 0x00000028
[  738.947873] Workqueue: 0000:b4:02.0_disco_q sas_discover_domain
[  738.953766] Call trace:
[  738.956203]  __switch_to+0xf4/0x148
[  738.959678]  __schedule+0x270/0x5d8
[  738.963152]  schedule+0x78/0x110
[  738.966368]  rpm_resume+0xcc/0x550
[  738.969757]  __pm_runtime_resume+0x3c/0x88
[  738.973836]  rpm_get_suppliers+0x50/0x148
[  738.977829]  __pm_runtime_set_status+0x124/0x2f0
[  738.982427]  scsi_sysfs_add_sdev+0x1a0/0x2a8
[  738.986679]  scsi_probe_and_add_lun+0x888/0xab0
[  738.991190]  __scsi_scan_target+0xec/0x520
[  738.995268]  scsi_scan_target+0x11c/0x128
[  738.999261]  sas_rphy_add+0x15c/0x1e8
[  739.002907]  sas_probe_devices+0xe4/0x150
[  739.006899]  sas_discover_domain+0x33c/0x588
[  739.011150]  process_one_work+0x1e8/0x360
[  739.015143]  worker_thread+0x44/0x478
[  739.018789]  kthread+0x150/0x158
[  739.022003]  ret_from_fork+0x10/0x1c
...

If an extra phy0 up happens during resume of the SAS controller, it will
emit a new libsas event (event PORTE_BYTES_DMAED and event
DISCE_DISCOVER_DOMAIN). We will call function scsi_sysfs_add_sdev() in
event DISCE_DISCOVER_DOMAIN, which will call __pm_runtime_set_status() to
resume supplier (host controller). For runtime PM core, if device is in the
resuming state, the later resume request of the device will wait for
previous resume request to complete synchronously. At that point in time
the state of the controller is still resuming as it waits for all libsas
events to be completed, while libsas event DISCE_DISCOVER_DOMAIN is blocked
as the state of the controller is resuming which causes a deadlock.

To avoid the issue, filter out new PHY up events while the controller is
suspended.

Link: https://lore.kernel.org/r/1601649038-25534-7-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-06 20:47:06 -04:00
Xiang Chen
16fd4a7c59 scsi: hisi_sas: Add device link between SCSI devices and hisi_hba
Runtime PM of SCSI devices is already supported in SCSI layer, we can
suspend/resume every SCSI device separately. But if there is no link
between hisi_hba and SCSI devices or SCSI targets it will cause issues if
the controller is suspended while SCSI devices are still resuming.  Only
when all the SCSI devices under the controller are suspended, the
controller can be suspended. Add the device link between SCSI devices
and the controller.

Link: https://lore.kernel.org/r/1601649038-25534-6-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-06 20:47:06 -04:00
Xiang Chen
e06596d500 scsi: hisi_sas: Add check for methods _PS0 and _PR0
To support system suspend/resume or runtime suspend/resume, need to use the
function pci_set_power_state() to change the power state which requires at
least method _PS0 or _PR0 be filled by platform for v3 hw. So check whether
the method is supported, if not, print a warning.

A Kconfig dependency is added as there is no stub for
acpi_device_power_manageable().

Link: https://lore.kernel.org/r/1601649038-25534-5-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-06 20:47:06 -04:00
Xiang Chen
65ff4aef7e scsi: hisi_sas: Add controller runtime PM support for v3 hw
Add controller runtime PM support for v3 hw.

Link: https://lore.kernel.org/r/1601649038-25534-4-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-06 20:47:06 -04:00
Xiang Chen
6c459ea154 scsi: hisi_sas: Switch to new framework to support suspend and resume
For v3 hw we will add support for runtime PM which is only supported in new
framework. Legacy PM support and new framework are not allowed to be used
together. Switch to new framework to support suspend and resume.

Link: https://lore.kernel.org/r/1601649038-25534-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-06 20:47:06 -04:00
Luo Jiaxing
7f054da773 scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq()
A call trace is observed when running function level reset with online CPUs
less than 16 and MSI auto-affinity enabled.

[16538.348038] Call trace:
[16538.348422]  pci_irq_vector+0x98/0xc0
[16538.348947]  disable_host_v3_hw+0x8c/0x288 [hisi_sas_v3_hw]
[16538.349706]  hisi_sas_reset_prepare_v3_hw+0x60/0x88 [hisi_sas_v3_hw]
[16538.350631]  pci_dev_save_and_disable+0x38/0x68
[16538.351290]  pci_reset_function+0x44/0x88
[16538.351846]  reset_store+0x6c/0xb8
[16538.352429]  dev_attr_store+0x44/0x60
[16538.353035]  sysfs_kf_write+0x58/0x80
[16538.353558]  kernfs_fop_write+0x140/0x230
[16538.354175]  __vfs_write+0x48/0x80
[16538.354675]  vfs_write+0xb8/0x1d8
[16538.355145]  ksys_write+0x74/0xf8
[16538.355615]  __arm64_sys_write+0x24/0x30
[16538.356240]  el0_svc_common.constprop.4+0x80/0x1f0
[16538.356905]  do_el0_svc+0x2c/0x38
[16538.357408]  el0_svc+0x14/0x40
[16538.357848]  el0_sync_handler+0xbc/0x2ec
[16538.358388]  el0_sync+0x140/0x180

The reason is that if we use pci_alloc_irq_vectors_affinity() to allocate
IRQs, the number of CQ IRQs can only be less than or equal to the number of
online CPUs, but we use hisi_hba->queue_count (always 16) to iterate during
interrupt_disable_v3_hw().

Use hisi_hba->cq_nvecs to replace hisi_hba->queue_count to avoid
synchronize IRQ on a CPU which does not exist.

Link: https://lore.kernel.org/r/1601649038-25534-2-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-06 20:47:06 -04:00
John Garry
8d98416a55 scsi: hisi_sas: Switch v3 hw to MQ
Now that the block layer provides a shared tag, we can switch the driver
to expose all HW queues.

Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Douglas Gilbert <dgilbert@interlog.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-06 08:33:44 -06:00
Luo Jiaxing
26f84f9bc3 scsi: hisi_sas: Code style cleanup
Remove extra blank lines and add spaces around operators.

Link: https://lore.kernel.org/r/1598958790-232272-9-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-02 22:49:08 -04:00
Xiang Chen
b601577df6 scsi: hisi_sas: Add missing newlines
Newline is missing from some printk() statements. Add them.

Link: https://lore.kernel.org/r/1598958790-232272-8-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-02 22:49:08 -04:00
Luo Jiaxing
981cc23e74 scsi: hisi_sas: Add BIST support for fixed code pattern
Through the new debugfs interface the user can select fixed code
patterns. Add two new interfaces fixed_code and fixed_code1.

Link: https://lore.kernel.org/r/1598958790-232272-7-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-02 22:49:08 -04:00
Luo Jiaxing
2c4d582322 scsi: hisi_sas: Add BIST support for phy FFE
Add BIST support for phy FFE (Feed forward equalizer) setting. The user can
configure FFE through the new debugfs interface.

FFE is a parameter used for link layer control. It will affect the link
quality between the SAS controller and the backplane. In the BIST test, the
FFE interface is provided to assist board testers in optimizing link
parameters.

The modification of the FFE parameter will affect the test after BIST or
the normal running of the board. The user should save the initial FFE
values and restore them after BIST test is complete.

Link: https://lore.kernel.org/r/1598958790-232272-6-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-02 22:49:08 -04:00
Luo Jiaxing
ca06f2cd01 scsi: hisi_sas: Make phy index variable name consistent
We use "phy_id" to identify phy in the BIST code but the rest of code
always uses "phy_no". Change it for consistency.

Link: https://lore.kernel.org/r/1598958790-232272-5-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-02 22:49:08 -04:00
Luo Jiaxing
caeddc0453 scsi: hisi_sas: Do not modify upper fields of PROG_PHY_LINK_RATE reg
When updating PROG_PHY_LINK_RATE to set linkrate for a phy we used a
hard-coded initial value instead of getting the current value from the
register. The assumption was that this register would not be modified, but
in fact it was partially modified in a new version of hardware. The
hard-coded value we used changed the default value of the register to a an
incorrect setting and as a result the SAS controller could not change
linkrate for the phy.

Delete hard-coded value and always read the latest value of register before
updating it.

Link: https://lore.kernel.org/r/1598958790-232272-4-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-02 22:49:08 -04:00
Luo Jiaxing
4b3a1f1fed scsi: hisi_sas: Modify macro name for OOB phy linkrate
The macro for OOB phy linkrate is named CFG_PROG_PHY_LINK_RATE_* but that
is inaccurate. For clarification, include OOB in macro name.

Link: https://lore.kernel.org/r/1598958790-232272-3-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-02 22:49:07 -04:00
Xiang Chen
847e835529 scsi: hisi_sas: Avoid accessing to SSP task for SMP I/Os
hisi_sas_slot_task_free() attempts to dereference SSP task for non-ATA
tasks. If the task is SMP, the code may reference the wrong structure
although this may not cause any problems.

To avoid this, only access to SSP task when slot->n_elem_dif is not 0 which
indicates this is an SSP task.

Link: https://lore.kernel.org/r/1598958790-232272-2-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-02 22:49:07 -04:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Linus Torvalds
dfdf16ecfd SCSI misc on 20200806
This series consists of the usual driver updates (ufs, qla2xxx, tcmu,
 lpfc, hpsa, zfcp, scsi_debug) and minor bug fixes.  We also have a
 huge docbook fix update like most other subsystems and no major update
 to the core (the few non trivial updates are either minor fixes or
 removing an unused feature [scsi_sdb_cache]).
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXyxq1yYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishSoAAQChZ4i8
 ZqYW3pL33JO3fA8vdjvLuyC489Hj4wzIsl3/bQEAxYyM6BSLvMoLWR2Plq/JmTLm
 4W/LDptarpTiDI3NuDc=
 =4b0W
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This consists of the usual driver updates (ufs, qla2xxx, tcmu, lpfc,
  hpsa, zfcp, scsi_debug) and minor bug fixes.

  We also have a huge docbook fix update like most other subsystems and
  no major update to the core (the few non trivial updates are either
  minor fixes or removing an unused feature [scsi_sdb_cache])"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (307 commits)
  scsi: scsi_transport_srp: Sanitize scsi_target_block/unblock sequences
  scsi: ufs-mediatek: Apply DELAY_AFTER_LPM quirk to Micron devices
  scsi: ufs: Introduce device quirk "DELAY_AFTER_LPM"
  scsi: virtio-scsi: Correctly handle the case where all LUNs are unplugged
  scsi: scsi_debug: Implement tur_ms_to_ready parameter
  scsi: scsi_debug: Fix request sense
  scsi: lpfc: Fix typo in comment for ULP
  scsi: ufs-mediatek: Prevent LPM operation on undeclared VCC
  scsi: iscsi: Do not put host in iscsi_set_flashnode_param()
  scsi: hpsa: Correct ctrl queue depth
  scsi: target: tcmu: Make TMR notification optional
  scsi: target: tcmu: Implement tmr_notify callback
  scsi: target: tcmu: Fix and simplify timeout handling
  scsi: target: tcmu: Factor out new helper ring_insert_padding
  scsi: target: tcmu: Do not queue aborted commands
  scsi: target: tcmu: Use priv pointer in se_cmd
  scsi: target: Add tmr_notify backend function
  scsi: target: Modify core_tmr_abort_task()
  scsi: target: iscsi: Fix inconsistent debug message
  scsi: target: iscsi: Fix login error when receiving
  ...
2020-08-06 16:50:07 -07:00
John Garry
3d570a28ee scsi: hisi_sas: Remove one kerneldoc comment
The comment for interrupt_init_v2_hw() should not be a kerneldoc
comment. Remove it.

Link: https://lore.kernel.org/r/1594627471-235395-3-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-13 23:32:29 -04:00
Luo Jiaxing
05d91b557a scsi: hisi_sas: Directly trigger SCSI error handling for completion errors
Abort failed commands in completion path. This avoids having to wait for
block layer timeouts and triggering the SCSI error handling thread.

Link: https://lore.kernel.org/r/1594627471-235395-2-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-13 23:32:28 -04:00
Christoph Hellwig
b8f1d1e058 scsi: Wire up ata_scsi_dma_need_drain for SAS HBA drivers
We need ata_scsi_dma_need_drain for all drivers wired up to drive ATAPI
devices through libata.  That also includes the SAS HBA drivers in addition
to native libata HBA drivers.

Link: https://lore.kernel.org/r/20200615064624.37317-3-hch@lst.de
Fixes: cc97923a5b ("block: move dma drain handling to scsi")
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-06-15 23:40:43 -04:00
Linus Torvalds
818dbde78e SCSI misc on 20200605
This series consists of the usual driver updates (qla2xxx, ufs, zfcp,
 target, scsi_debug, lpfc, qedi, qedf, hisi_sas, mpt3sas) plus a host
 of other minor updates.  There are no major core changes in this
 series apart from a refactoring in scsi_lib.c.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXtq5QyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXyGAQCipTWx
 7kHKHZBCVTU133bADt3+SstLrAm8PKZEXMnP9wEAzu4QkkW8URxEDRrpu7qk5gbA
 9M/KyqvfRtTH7+BSK7M=
 =J6aO
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 :This series consists of the usual driver updates (qla2xxx, ufs, zfcp,
  target, scsi_debug, lpfc, qedi, qedf, hisi_sas, mpt3sas) plus a host
  of other minor updates.

  There are no major core changes in this series apart from a
  refactoring in scsi_lib.c"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (207 commits)
  scsi: ufs: ti-j721e-ufs: Fix unwinding of pm_runtime changes
  scsi: cxgb3i: Fix some leaks in init_act_open()
  scsi: ibmvscsi: Make some functions static
  scsi: iscsi: Fix deadlock on recovery path during GFP_IO reclaim
  scsi: ufs: Fix WriteBooster flush during runtime suspend
  scsi: ufs: Fix index of attributes query for WriteBooster feature
  scsi: ufs: Allow WriteBooster on UFS 2.2 devices
  scsi: ufs: Remove unnecessary memset for dev_info
  scsi: ufs-qcom: Fix scheduling while atomic issue
  scsi: mpt3sas: Fix reply queue count in non RDPQ mode
  scsi: lpfc: Fix lpfc_nodelist leak when processing unsolicited event
  scsi: target: tcmu: Fix a use after free in tcmu_check_expired_queue_cmd()
  scsi: vhost: Notify TCM about the maximum sg entries supported per command
  scsi: qla2xxx: Remove return value from qla_nvme_ls()
  scsi: qla2xxx: Remove an unused function
  scsi: iscsi: Register sysfs for iscsi workqueue
  scsi: scsi_debug: Parser tables and code interaction
  scsi: core: Refactor scsi_mq_setup_tags function
  scsi: core: Fix incorrect usage of shost_for_each_device
  scsi: qla2xxx: Fix endianness annotations in source files
  ...
2020-06-05 15:11:50 -07:00
John Garry
1cdee00442 scsi: hisi_sas: Stop returning error code from slot_complete_vX_hw()
The error codes are never checked, stop returning them.

Link: https://lore.kernel.org/r/1589552025-165012-5-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-19 21:23:59 -04:00
Luo Jiaxing
1a0efb55b2 scsi: hisi_sas: Add SAS_RAS_INTR0 to debugfs register name list
Register SAS_RAS_INTR0 can help us to figure out which ECC error has
occurred. This register is helpful to identify RAS issue, so we add it to
the list of debugfs register name list for easier retrieval.

Link: https://lore.kernel.org/r/1589552025-165012-4-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-19 21:23:58 -04:00
Luo Jiaxing
1e954d1f00 scsi: hisi_sas: Modify the commit information for DSM method
Make it clear that BIOS may modify some register settings.

Link: https://lore.kernel.org/r/1589552025-165012-3-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-19 21:23:58 -04:00
Luo Jiaxing
e16b9ed61e scsi: hisi_sas: Do not reset phy timer to wait for stray phy up
We found out that after phy up, the hardware reports another oob interrupt
but did not follow a phy up interrupt:

oob ready -> phy up -> DEV found -> oob read -> wait phy up -> timeout

We run link reset when wait phy up timeout, and it send a normal disk into
reset processing. So we made some circumvention action in the code, so that
this abnormal oob interrupt will not start the timer to wait for phy up.

Link: https://lore.kernel.org/r/1589552025-165012-2-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-19 21:23:57 -04:00
Jason Yan
55ce24b3bf scsi: hisi_sas: Display proc_name in sysfs
The 'proc_name' entry in sysfs for hisi_sas is 'null' now because it is not
initialized in scsi_host_template. It looks like:

[root@localhost ~]# cat /sys/class/scsi_host/host2/proc_name
(null)

While the other driver's entry looks like:

linux-vnMQMU:~ # cat /sys/class/scsi_host/host0/proc_name
megaraid_sas

Link: https://lore.kernel.org/r/20200512113258.30781-1-yanaijie@huawei.com
Cc: John Garry <john.garry@huawei.com>
Cc: Xiang Chen <chenxiang66@hisilicon.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-12 20:20:07 -04:00
YueHaibing
1d95b8a2d4 scsi: hisi_sas: Fix build error without SATA_HOST
If SATA_HOST is n, build fails:

drivers/scsi/hisi_sas/hisi_sas_main.o: In function
`hisi_sas_fill_ata_reset_cmd': hisi_sas_main.c:(.text+0x2500): undefined
reference to `ata_tf_to_fis'

Select SATA_HOST to fix this.

Link: https://lore.kernel.org/r/20200402085812.32948-1-yuehaibing@huawei.com
Fixes: bd322af15c ("ata: make SATA_PMP option selectable only if any SATA host driver is enabled")
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-13 13:28:13 -04:00
Luo Jiaxing
1e067dd8a3 scsi: hisi_sas: Use dev_err() in read_iost_itct_cache_v3_hw()
The print of pr_err() does not come with device information, so replace it
with dev_err(). Also improve the grammar in the message.

Link: https://lore.kernel.org/r/1583940144-230800-1-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-03-16 18:24:23 -04:00
John Garry
11e673206f scsi: hisi_sas: Rename hisi_sas_cq.pci_irq_mask
In future we will want to use hisi_sas_cq.pci_irq_mask for non-pci
interrupt masks, so rename to be more general.

Link: https://lore.kernel.org/r/1579522957-4393-7-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-20 19:31:14 -05:00
Luo Jiaxing
33c77c31b7 scsi: hisi_sas: Add prints for v3 hw interrupt converge and automatic affinity
Add prints to inform the user of enabled features.

Link: https://lore.kernel.org/r/1579522957-4393-6-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-20 19:31:14 -05:00
Luo Jiaxing
3cd2f3c35d scsi: hisi_sas: Modify the file permissions of trigger_dump to write only
The trigger_dump file is only used to manually trigger the dump, and did
not provide a read callback function for it, so its file permission
setting to 600 is wrong,and should be changed to 200.

Link: https://lore.kernel.org/r/1579522957-4393-5-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-20 19:31:14 -05:00
Luo Jiaxing
d2815fdf9a scsi: hisi_sas: Replace magic number when handle channel interrupt
We use magic number as offset and mask when handle channel interrupt, so
use macro to replace it.

Link: https://lore.kernel.org/r/1579522957-4393-4-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-20 19:31:14 -05:00
Xiang Chen
e9dc5e11c9 scsi: hisi_sas: replace spin_lock_irqsave/spin_unlock_restore with spin_lock/spin_unlock
After changing tasklet to workqueue or threaded irq, some critical
resources are only used on threads (not in interrupt or bottom half of
interrupt), so replace spin_lock_irqsave/spin_unlock_restore with
spin_lock/spin_unlock to protect those critical resources.

Link: https://lore.kernel.org/r/1579522957-4393-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-20 19:31:13 -05:00
Xiang Chen
81f338e970 scsi: hisi_sas: use threaded irq to process CQ interrupts
Currently IRQ_EFFECTIVE_AFF_MASK is enabled for ARM_GIC and ARM_GIC3, so it
only allows a single target CPU in the affinity mask to process interrupts
and also interrupt thread, and the performance of using threaded irq is
almost the same as tasklet. But if the config is not enabled, the interrupt
thread will be allowed all the CPUs in the affinity mask. At that situation
it improves the performance (about 20%).

Note: IRQ_EFFECTIVE_AFF_MASK is configured differently for different
architecture chip, and it seems to be better to make it be configured
easily.

Link: https://lore.kernel.org/r/1579522957-4393-2-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-20 19:31:13 -05:00
Arnd Bergmann
75c0b0e118 compat_ioctl: scsi: handle HDIO commands from drivers
The ata_sas_scsi_ioctl() function implements a number of HDIO_* commands
for SCSI devices, it is used by all libata drivers as well as a few
drivers that support SAS attached SATA drives.

The only command that is not safe for compat ioctls here is
HDIO_GET_32BIT. Change the implementation to check for in_compat_syscall()
in order to do both cases correctly, and change all callers to use it
as both native and compat callback pointers, including the indirect
callers through sas_ioctl and ata_scsi_ioctl.

Reviewed-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-01-03 09:42:52 +01:00
John Garry
964231aa0c scsi: hisi_sas: Stop converting a bool into a bool
The !! operator on a bool is pointless, so remove an example in
hisi_sas_rescan_topology().

Link: https://lore.kernel.org/r/1573551059-107873-5-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:34 -05:00
Xiang Chen
7c0ecd40c3 scsi: hisi_sas: Relocate call to hisi_sas_debugfs_exit()
Currently we call function hisi_sas_debugfs_exit() to remove debugfs_dir
before freeing interrupt irqs and destroying workqueue in the driver remove
path.

If a dump is triggered before function hisi_sas_debugfs_exit() but
debugfs_work may be called after it, so it may refer to already removed
debugfs_dir which will cause NULL pointer dereference.

To avoid it, put function hisi_sas_debugfs_exit() after free_irqs and
destroy workqueue when removing hisi_sas driver.

Link: https://lore.kernel.org/r/1573551059-107873-4-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:34 -05:00
Xiang Chen
547fde8b5a scsi: hisi_sas: Return directly if init hardware failed
Need to return directly if init hardware failed.

Fixes: 73a4925d15 ("scsi: hisi_sas: Update all the registers after suspend and resume")
Link: https://lore.kernel.org/r/1573551059-107873-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:34 -05:00
Xiang Chen
8c39673d54 scsi: hisi_sas: Check sas_port before using it
Need to check the structure sas_port before using it.

Link: https://lore.kernel.org/r/1573551059-107873-2-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:33 -05:00
Luo Jiaxing
f873b66119 scsi: hisi_sas: Record the phy down event in debugfs
The number of phy down reflects the quality of the link between SAS
controller and disk. In order to allow the user to confirm the link quality
of the system, we record the number of phy down for each phy.

The user can check the current phy down count by reading the debugfs file
corresponding to the specific phy, or clear the phy down count by writing 0
to the debugfs file.

Link: https://lore.kernel.org/r/1571926105-74636-19-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:15 -04:00
Luo Jiaxing
cabe7c10c9 scsi: hisi_sas: Delete the debugfs folder of hisi_sas when the probe fails
Although if the debugfs initialization fails, we will delete the debugfs
folder of hisi_sas, but we did not consider the scenario where debugfs was
successfully initialized, but the probe failed for other reasons. We found
out that hisi_sas folder is still remain after the probe failed.

When probe fail, we should delete debugfs folder to avoid the above issue.

Link: https://lore.kernel.org/r/1571926105-74636-18-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:15 -04:00
Luo Jiaxing
8f6432986e scsi: hisi_sas: Add ability to have multiple debugfs dumps
We use the module parameter debugfs_dump_count to manage the upper limit of
the memory block for multiple dumps.

Link: https://lore.kernel.org/r/1571926105-74636-17-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:15 -04:00
Luo Jiaxing
905ab01faf scsi: hisi_sas: Add module parameter for debugfs dump count
We still only use dump index #0 however.

Link: https://lore.kernel.org/r/1571926105-74636-16-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:15 -04:00
Luo Jiaxing
a70e33eae3 scsi: hisi_sas: Allocate memory for multiple dumps of debugfs
We add multiple dumps for debugfs, but only allocate memory this time and
only dump #0.

Link: https://lore.kernel.org/r/1571926105-74636-15-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:15 -04:00
Luo Jiaxing
357e4fc7a9 scsi: hisi_sas: Add debugfs file structure for ITCT cache
Create a file structure which was used to save the memory address for
ITCT cache at debugfs. This structure is bound to the corresponding debugfs
file, it can help callback function of debugfs file to get what it needs.

Link: https://lore.kernel.org/r/1571926105-74636-14-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:14 -04:00
Luo Jiaxing
b714dd8f36 scsi: hisi_sas: Add debugfs file structure for IOST cache
Create a file structure which was used to save the memory address for IOST
cache at debugfs. This structure is bound to the corresponding debugfs
file, it can help callback function of debugfs file to get what it needs.

Link: https://lore.kernel.org/r/1571926105-74636-13-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:14 -04:00
Luo Jiaxing
0161d55f23 scsi: hisi_sas: Add debugfs file structure for ITCT
Create a file structure which was used to save the memory address for ITCT
at debugfs. This structure is bound to the corresponding debugfs file, it
can help callback function of debugfs file to get what it needs.

Link: https://lore.kernel.org/r/1571926105-74636-12-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:14 -04:00
Luo Jiaxing
e15f2e2dff scsi: hisi_sas: Add debugfs file structure for IOST
Create a file structure which was used to save the memory address for IOST
at debugfs. This structure is bound to the corresponding debugfs file, it
can help callback function of debugfs file to get what it needs.

Link: https://lore.kernel.org/r/1571926105-74636-11-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:14 -04:00
Luo Jiaxing
1f66e1fd26 scsi: hisi_sas: Add debugfs file structure for port
Create a file structure which was used to save the memory address and phy
pointer for port at debugfs. This structure is bound to the corresponding
debugfs file, it can help callback function of debugfs file to get what it
need.

Link: https://lore.kernel.org/r/1571926105-74636-10-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:14 -04:00
Luo Jiaxing
c611639810 scsi: hisi_sas: Add debugfs file structure for registers
Create a file structure which was used to save the memory address and
hisi_hba pointer for REGS at debugfs. This structure is bound to the
corresponding debugfs file, it can help callback function of debugfs file
to get what it need.

Link: https://lore.kernel.org/r/1571926105-74636-9-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:14 -04:00
Luo Jiaxing
1b54c4db72 scsi: hisi_sas: Add debugfs file structure for DQ
Create a file structure which was used to save the memory address and DQ
pointer for DQ at debugfs. This structure is bound to the corresponding
debugfs file, it can help callback function of debugfs file to get what it
need.

Link: https://lore.kernel.org/r/1571926105-74636-8-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:14 -04:00
Luo Jiaxing
35ea630b2b scsi: hisi_sas: Add debugfs file structure for CQ
Create a file structure which was used to save the memory address and CQ
pointer for CQ at debugfs. This structure is bound to the corresponding
debugfs file, it can help callback function of debugfs file to get what it
need.

Link: https://lore.kernel.org/r/1571926105-74636-7-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:13 -04:00
Luo Jiaxing
d28ed83b76 scsi: hisi_sas: Add timestamp for a debugfs dump
It's useful to know when the dump occurred, so add a timestamp file for
this.

Link: https://lore.kernel.org/r/1571926105-74636-6-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:13 -04:00
Xiang Chen
550c0d89d5 scsi: hisi_sas: Replace in_softirq() check in hisi_sas_task_exec()
For IOs from upper layer, preemption may be disabled as it may be called by
function __blk_mq_delay_run_hw_queue which will call get_cpu() (it disables
preemption). So if flags HISI_SAS_REJECT_CMD_BIT is set in function
hisi_sas_task_exec(), it may disable preempt twice after down() and up()
which will cause following call trace:

BUG: scheduling while atomic: fio/60373/0x00000002
Call trace:
dump_backtrace+0x0/0x150
show_stack+0x24/0x30
dump_stack+0xa0/0xc4
__schedule_bug+0x68/0x88
__schedule+0x4b8/0x548
schedule+0x40/0xd0
schedule_timeout+0x200/0x378
__down+0x78/0xc8
down+0x54/0x70
hisi_sas_task_exec.isra.10+0x598/0x8d8 [hisi_sas_main]
hisi_sas_queue_command+0x28/0x38 [hisi_sas_main]
sas_queuecommand+0x168/0x1b0 [libsas]
scsi_queue_rq+0x2ac/0x980
blk_mq_dispatch_rq_list+0xb0/0x550
blk_mq_do_dispatch_sched+0x6c/0x110
blk_mq_sched_dispatch_requests+0x114/0x1d8
__blk_mq_run_hw_queue+0xb8/0x130
__blk_mq_delay_run_hw_queue+0x1c0/0x220
blk_mq_run_hw_queue+0xb0/0x128
blk_mq_sched_insert_requests+0xdc/0x208
blk_mq_flush_plug_list+0x1b4/0x3a0
blk_flush_plug_list+0xdc/0x110
blk_finish_plug+0x3c/0x50
blkdev_direct_IO+0x404/0x550
generic_file_read_iter+0x9c/0x848
blkdev_read_iter+0x50/0x78
aio_read+0xc8/0x170
io_submit_one+0x1fc/0x8d8
__arm64_sys_io_submit+0xdc/0x280
el0_svc_common.constprop.0+0xe0/0x1e0
el0_svc_handler+0x34/0x90
el0_svc+0x10/0x14
...

To solve the issue, check preemptible() to avoid disabling preempt multiple
when flag HISI_SAS_REJECT_CMD_BIT is set.

Link: https://lore.kernel.org/r/1571926105-74636-5-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:13 -04:00
Xiang Chen
8fa9a7bd30 scsi: hisi_sas: use wait_for_completion_timeout() when clearing ITCT
When injecting 2bit ecc errors, it will cause confusion inside SAS
controller which needs host reset to recover it. If a device is gone at the
same times inject 2bit ecc errors, we may not receive the ITCT interrupt so
it will wait for completion in clear_itct_v3_hw() all the time. And host
reset will also not occur because it can't require hisi_hba->sem, so the
system will be suspended.

To solve the issue, use wait_for_completion_timeout() instead of
wait_for_completion(), and also don't mark the gone device as
SAS_PHY_UNUSED when device gone.

Link: https://lore.kernel.org/r/1571926105-74636-4-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:13 -04:00
Xiang Chen
65a3b8bd56 scsi: hisi_sas: Set the BIST init value before enabling BIST
If set the BIST init value after enabling BIST, there may be still some few
error bits. According to the process, need to set the BIST init value
before enabling BIST.

Fixes: 97b151e758 ("scsi: hisi_sas: Add BIST support for phy loopback")
Link: https://lore.kernel.org/r/1571926105-74636-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:13 -04:00
Xiang Chen
35160421b6 scsi: hisi_sas: Don't create debugfs dump folder twice
Due to a merge error, we attempt to create 2x debugfs dump folders, which
fails:
[  861.101914] debugfs: Directory 'dump' with parent '0000:74:02.0'
already present!

This breaks the dump function.

To fix, remove the superfluous attempt to create the folder.

Fixes: 7ec7082c57 ("scsi: hisi_sas: Add hisi_sas_debugfs_alloc() to centralise allocation")
Link: https://lore.kernel.org/r/1571926105-74636-2-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:31:13 -04:00
Martin K. Petersen
a3a8d13f62 Merge branch '5.4/scsi-fixes' into 5.5/scsi-queue
The qla2xxx driver updates for 5.5 depend on the fixes queued for
5.4.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-09 21:54:04 -04:00
Colin Ian King
b1000fcca1 scsi: hisi_sas: fix spelling mistake "digial" -> "digital"
There is a spelling mistake in literal string. Fix it.

Link: https://lore.kernel.org/r/20190916091706.32268-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-30 22:57:53 -04:00
YueHaibing
4b6b1bb686 scsi: hisi_sas: Make three functions static
Fix sparse warnings:

drivers/scsi/hisi_sas/hisi_sas_main.c:3686:6:
 warning: symbol 'hisi_sas_debugfs_release' was not declared. Should it be static?
drivers/scsi/hisi_sas/hisi_sas_main.c:3708:5:
 warning: symbol 'hisi_sas_debugfs_alloc' was not declared. Should it be static?
drivers/scsi/hisi_sas/hisi_sas_main.c:3799:6:
 warning: symbol 'hisi_sas_debugfs_bist_init' was not declared. Should it be static?

Link: https://lore.kernel.org/r/20190923054035.19036-1-yuehaibing@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-23 23:09:42 -04:00
Xiang Chen
e74006edd0 scsi: hisi_sas: Fix the conflict between device gone and host reset
When device gone, it will check whether it is during reset, if not, it will
send internal task abort. Before internal task abort returned, reset
begins, and it will check whether SAS_PHY_UNUSED is set, if not, it will
call hisi_sas_init_device(), but at that time domain_device may already be
freed or part of it is freed, so it may referenece null pointer in
hisi_sas_init_device(). It may occur as follows:

    thread0				thread1
hisi_sas_dev_gone()
    check whether in RESET(no)
    internal task abort
				    reset prep
				    soft_reset
				    ... (part of reset_done)
    internal task abort failed
    release resource anyway
    clear_itct
    device->lldd_dev=NULL
				    hisi_sas_reset_init_all_device
					check sas_dev->dev_type is SAS_PHY_UNUSED and
					!device
    set dev_type SAS_PHY_UNUSED
    sas_free_device
					hisi_sas_init_device
					...

Semaphore hisi_hba.sema is used to sync the processes of device gone and
host reset.

To solve the issue, expand the scope that semaphore protects and let them
never occur together.

And also some places will check whether domain_device is NULL to judge
whether the device is gone. So when device gone, need to clear
sas_dev->sas_device.

Link: https://lore.kernel.org/r/1567774537-20003-14-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:57 -04:00
Xiang Chen
97b151e758 scsi: hisi_sas: Add BIST support for phy loopback
Add BIST (built in self test) support for phy loopback.

Through the new debugfs interface, the user can configure loopback
mode/linkrate/phy id/code mode before enabling it. And also user can
enable/disable BIST function.

Link: https://lore.kernel.org/r/1567774537-20003-13-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:57 -04:00
Luo Jiaxing
7ec7082c57 scsi: hisi_sas: Add hisi_sas_debugfs_alloc() to centralise allocation
We extract the code of memory allocate and construct an new function for
it. We think it's convenient for subsequent optimization.

Link: https://lore.kernel.org/r/1567774537-20003-12-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:57 -04:00
Luo Jiaxing
4bc058097a scsi: hisi_sas: Remove some unused function arguments
Some function arguments are unused, so remove them.

Also move the timeout print in for wait_cmds_complete_timeout_vX_hw()
callsites into that same function.

Link: https://lore.kernel.org/r/1567774537-20003-11-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:56 -04:00
Xiang Chen
27f22723c3 scsi: hisi_sas: Remove redundant work declaration
Remove redundant work declaration in HISI_SAS_DECLARE_RST_WORK_ON_STACK

Link: https://lore.kernel.org/r/1567774537-20003-10-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:56 -04:00
Luo Jiaxing
971b59443f scsi: hisi_sas: Remove hisi_sas_hw.slot_complete
We never call hisi_sas_hw.slot_complete, so remove it.

Link: https://lore.kernel.org/r/1567774537-20003-9-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:56 -04:00
Xiang Chen
435a05cf8c scsi: hisi_sas: Assign NCQ tag for all NCQ commands
Currently the NCQ tag is only assigned for FPDMA READ and FPDMA WRITE
commands, and for other NCQ commands (such as FPDMA SEND), their NCQ tags
are set in the delivery command to 0.

So for all the NCQ commands, we also need to assign normal NCQ tag for
them, so drop the command type check in hisi_sas_get_ncq_tag() [drop
hisi_sas_get_ncq_tag() altogether actually], and always use the ATA command
NCQ tag when appropriate.

Link: https://lore.kernel.org/r/1567774537-20003-8-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:56 -04:00
Xiang Chen
73a4925d15 scsi: hisi_sas: Update all the registers after suspend and resume
After suspend and resume, the HW registers will be set back to their
initial value. We use init_reg_v3_hw() to set some registers, but some
registers are set via firmware in ACPI "_RST" method, so add reset handler
before init_reg_v3_hw().

Link: https://lore.kernel.org/r/1567774537-20003-7-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:56 -04:00
Xiang Chen
b45e05aa5d scsi: hisi_sas: Retry 3 times TMF IO for SAS disks when init device
When init device for SAS disks, it will send TMF IO to clear disks. At that
time TMF IO is broken by some operations such as injecting controller reset
from HW RAs event, the TMF IO will be timeout, and at last device will be
gone. Print is as followed:

hisi_sas_v3_hw 0000:74:02.0: dev[240:1] found
...
hisi_sas_v3_hw 0000:74:02.0: controller resetting...
hisi_sas_v3_hw 0000:74:02.0: phyup: phy7 link_rate=10(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy0 link_rate=9(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy1 link_rate=9(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy2 link_rate=9(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy3 link_rate=9(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy6 link_rate=10(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy5 link_rate=11
hisi_sas_v3_hw 0000:74:02.0: phyup: phy4 link_rate=11
hisi_sas_v3_hw 0000:74:02.0: controller reset complete
hisi_sas_v3_hw 0000:74:02.0: abort tmf: TMF task timeout and not done
hisi_sas_v3_hw 0000:74:02.0: dev[240:1] is gone
sas: driver on host 0000:74:02.0 cannot handle device 5000c500a75a860d,
error:5

To improve the reliability, retry TMF IO max of 3 times for SAS disks which
is the same as softreset does.

Link: https://lore.kernel.org/r/1567774537-20003-6-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:56 -04:00
Luo Jiaxing
76dd768b44 scsi: hisi_sas: Remove sleep after issue phy reset if sas_smp_phy_control() fails
At expander environment, we delay after issue phy reset to wait for
hardware to handle phy reset. But if sas_smp_phy_control() fails, the
delay is unnecessary so remove it.

Link: https://lore.kernel.org/r/1567774537-20003-5-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:56 -04:00
Luo Jiaxing
c2bae4f7d7 scsi: hisi_sas: Directly return when running I_T_nexus reset if phy disabled
At hisi_sas_debug_I_T_nexus_reset(), we call sas_phy_reset() to reset a
phy. But if the phy is disabled, sas_phy_reset() will directly return
-ENODEV without issue a phy reset request.

If so, We can directly return -ENODEV to libsas before issue a phy
reset.

Link: https://lore.kernel.org/r/1567774537-20003-4-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:55 -04:00
Luo Jiaxing
af01b2b924 scsi: hisi_sas: Use true/false as input parameter of sas_phy_reset()
When calling sas_phy_reset(), we need to specify whether the reset type
is hard reset or link reset - use true/false for clarity.

Link: https://lore.kernel.org/r/1567774537-20003-3-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:55 -04:00
Luo Jiaxing
7105e68afa scsi: hisi_sas: add debugfs auto-trigger for internal abort time out
This trigger is add at _hisi_sas_internal_task_abort()

Link: https://lore.kernel.org/r/1567774537-20003-2-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-10 22:28:55 -04:00
YueHaibing
c0c1a71e95 scsi: hisi_sas: use devm_platform_ioremap_resource() to simplify code
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.

Link: https://lore.kernel.org/r/20190904130256.24704-1-yuehaibing@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-07 16:40:56 -04:00
zhengbin
328bc6debf scsi: hisi_sas: remove set but not used variable 'irq_value'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/scsi/hisi_sas/hisi_sas_v1_hw.c: In function cq_interrupt_v1_hw:
drivers/scsi/hisi_sas/hisi_sas_v1_hw.c:1542:6: warning: variable irq_value set but not used [-Wunused-but-set-variable]

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-29 18:01:58 -04:00
Luo Jiaxing
a5ac1f5d9a scsi: hisi_sas: Consolidate internal abort calls in LU reset operation
In hisi_sas_lu_reset(), we call internal abort for SAS and SATA device
codepaths -> consolidate into a single call.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:16 -04:00
Xiang Chen
e7513f666b scsi: hisi_sas: replace "%p" with "%pK"
The format specifier "%p" can leak kernel address, and use "%pK" instead.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Xiang Chen
a07b48766c scsi: hisi_sas: Remove some unnecessary code
Remove some unnecessary code, including:

 - Explicit zeroing of memory allocated for dmam_alloc_coherent()

 - Some duplicated code

 - Some redundant masking

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Luo Jiaxing
7bf18e849d scsi: hisi_sas: Modify return type of debugfs functions
For functions which always return 0, which is never checked, make to return
void.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Xiang Chen
e16963f378 scsi: hisi_sas: Drop free_irq() when devm_request_irq() failed
It will free irq automatically if devm_request_irq() failed, so drop
free_irq() if devm_request_irq() failed.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
John Garry
5f6c32d7ce scsi: hisi_sas: Drop SMP resp frame DMA mapping
The SMP frame response is written to the command table and not the SMP
response pointer from libsas, so don't bother DMA mapping (and unmapping)
the SMP response from libsas.

Suggested-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
John Garry
1c003146c6 scsi: hisi_sas: Drop kmap_atomic() in SMP command completion
The call to kmap_atomic() in the SMP command completion code is
unnecessary, since kmap() is only really concerned with highmem, which is
not relevant on arm64. The controller only finds itself in arm64 systems.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Xiang Chen
599aefc81e scsi: hisi_sas: Make slot buf minimum allocation of PAGE_SIZE
For a system with PAGE_SIZE of 16K or 64K, the size every time we want to
alloc may be small like 4K, but for function dmam_alloc_coherent(), the
least size it allocates is PAGE_SIZE, so it will waste much memory for the
situation.

To solve the issue, limit the minimum allocation size of slot buf to
PAGE_SIZE.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Xiang Chen
d380f55503 scsi: hisi_sas: Don't bother clearing status buffer IU in task prep
For struct hisi_sas_status_buffer, it contains struct hisi_sas_err_record
and iu[1024]. The struct iu[1024] will be filled fully by the response of
disks, so it is not need to initialize them to 0, but for the struct
hisi_sas_err_record, SAS controller only fill some fields of
hisi_sas_err_record according to hw designer, so it should be initialised
to 0.  After the change, cpu utilization percentage of memset() is changed
from 1.7% to 0.12%.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Luo Jiaxing
445ee2de11 scsi: hisi_sas: Fix out of bound at debug_I_T_nexus_reset()
Fix a possible out-of-bounds access in hisi_sas_debug_I_T_nexus_reset().

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Luo Jiaxing
b0b3e4290e scsi: hisi_sas: Snapshot AXI and RAS register at debugfs
The AXI and RAS register values should also should be snapshot at debugfs.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:15 -04:00
Luo Jiaxing
bbe0a7b348 scsi: hisi_sas: Snapshot HW cache of IOST and ITCT at debugfs
The value of IOST/ITCT is updated to cache first, and then synchronize to
DDR periodically. So the value in IOST/ITCT cache is the latest data and
it's important for debugging.

So, the HW cache of IOST and ITCT should be snapshot at debugfs.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:14 -04:00
Luo Jiaxing
bee0cf25c0 scsi: hisi_sas: Fix pointer usage error in show debugfs IOST/ITCT
Fix how the pointer is set in hisi_sas_debugfs_iost_show() and
hisi_sas_debugfs_itct_show().

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:14 -04:00
John Garry
897cc769bc scsi: hisi_sas: Drop hisi_sas_hw.get_free_slot
In commit 1273d65f29 ("scsi: hisi_sas: change queue depth from 512 to
4096"), the depth of each queue is the same as the max IPTT in the system.

As such, as long as we have an IPTT allocated, we will have enough space on
any delivery queue.

All .get_free_slot functions were checking for space on the queue by
reading the DQ read pointer. Drop this, and also raise the code into common
code, as there is nothing hw specific remaining.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:14 -04:00
John Garry
93352abc81 scsi: hisi_sas: Make max IPTT count equal for all hw revisions
There is a small optimisation to be had by making the max IPTT the same for
all hw revisions, that being we can drop the check for read and write
pointer being the same in the get free slot function.

Change v1 hw to have max IPTT of 4096 - same as v2 and v3 hw - and
drop hisi_sas_hw.max_command_entries.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 22:13:14 -04:00
Linus Torvalds
ba6d10ab80 SCSI misc on 20190709
This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs,
 mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the
 removal of the osst driver (I heard from Willem privately that he
 would like the driver removed because all his test hardware has
 failed).  Plus number of minor changes, spelling fixes and other
 trivia.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXSTl4yYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdcxAQDCJVbd
 fPUX76/V1ldupunF97+3DTharxxbst+VnkOnCwD8D4c0KFFFOI9+F36cnMGCPegE
 fjy17dQLvsJ4GsidHy8=
 =aS5B
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs,
  mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the
  removal of the osst driver (I heard from Willem privately that he
  would like the driver removed because all his test hardware has
  failed). Plus number of minor changes, spelling fixes and other
  trivia.

  The big merge conflict this time around is the SPDX licence tags.
  Following discussion on linux-next, we believe our version to be more
  accurate than the one in the tree, so the resolution is to take our
  version for all the SPDX conflicts"

Note on the SPDX license tag conversion conflicts: the SCSI tree had
done its own SPDX conversion, which in some cases conflicted with the
treewide ones done by Thomas & co.

In almost all cases, the conflicts were purely syntactic: the SCSI tree
used the old-style SPDX tags ("GPL-2.0" and "GPL-2.0+") while the
treewide conversion had used the new-style ones ("GPL-2.0-only" and
"GPL-2.0-or-later").

In these cases I picked the new-style one.

In a few cases, the SPDX conversion was actually different, though.  As
explained by James above, and in more detail in a pre-pull-request
thread:

 "The other problem is actually substantive: In the libsas code Luben
  Tuikov originally specified gpl 2.0 only by dint of stating:

  * This file is licensed under GPLv2.

  In all the libsas files, but then muddied the water by quoting GPLv2
  verbatim (which includes the or later than language). So for these
  files Christoph did the conversion to v2 only SPDX tags and Thomas
  converted to v2 or later tags"

So in those cases, where the spdx tag substantially mattered, I took the
SCSI tree conversion of it, but then also took the opportunity to turn
the old-style "GPL-2.0" into a new-style "GPL-2.0-only" tag.

Similarly, when there were whitespace differences or other differences
to the comments around the copyright notices, I took the version from
the SCSI tree as being the more specific conversion.

Finally, in the spdx conversions that had no conflicts (because the
treewide ones hadn't been done for those files), I just took the SCSI
tree version as-is, even if it was old-style.  The old-style conversions
are perfectly valid, even if the "-only" and "-or-later" versions are
perhaps more descriptive.

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (185 commits)
  scsi: qla2xxx: move IO flush to the front of NVME rport unregistration
  scsi: qla2xxx: Fix NVME cmd and LS cmd timeout race condition
  scsi: qla2xxx: on session delete, return nvme cmd
  scsi: qla2xxx: Fix kernel crash after disconnecting NVMe devices
  scsi: megaraid_sas: Update driver version to 07.710.06.00-rc1
  scsi: megaraid_sas: Introduce various Aero performance modes
  scsi: megaraid_sas: Use high IOPS queues based on IO workload
  scsi: megaraid_sas: Set affinity for high IOPS reply queues
  scsi: megaraid_sas: Enable coalescing for high IOPS queues
  scsi: megaraid_sas: Add support for High IOPS queues
  scsi: megaraid_sas: Add support for MPI toolbox commands
  scsi: megaraid_sas: Offload Aero RAID5/6 division calculations to driver
  scsi: megaraid_sas: RAID1 PCI bandwidth limit algorithm is applicable for only Ventura
  scsi: megaraid_sas: megaraid_sas: Add check for count returned by HOST_DEVICE_LIST DCMD
  scsi: megaraid_sas: Handle sequence JBOD map failure at driver level
  scsi: megaraid_sas: Don't send FPIO to RL Bypass queue
  scsi: megaraid_sas: In probe context, retry IOC INIT once if firmware is in fault
  scsi: megaraid_sas: Release Mutex lock before OCR in case of DCMD timeout
  scsi: megaraid_sas: Call disable_irq from process IRQ poll
  scsi: megaraid_sas: Remove few debug counters from IO path
  ...
2019-07-11 15:14:01 -07:00
John Garry
924a3541ea scsi: libsas: aic94xx: hisi_sas: mvsas: pm8001: Use dev_is_expander()
Many times in libsas, and in LLDDs which use libsas, the check for an
expander device is re-implemented or open coded.

Use dev_is_expander() instead. We rename this from
sas_dev_type_is_expander() to not spill so many lines in referencing.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-20 15:37:02 -04:00
Xiang Chen
97fcf176b4 scsi: hisi_sas: Disable stash for v3 hw
For v3 hw, stash is enabled to promote performance, but it does little to
improve performance according to current tests. What's more, it causes
exceptions for some situations, so disable it.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:24 -04:00
Luo Jiaxing
e4c19deba6 scsi: hisi_sas: Ignore the error code between phy down to phy up
Several error codes will be generated between PHY down to up.

This issue was introduced by HW design. The designers came to the
conclusion that we should ignore these errors.

Signed-off-by: Jiaxing Luo <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:24 -04:00
Xiang Chen
0ab7bc825a scsi: hisi_sas: Change the type of some numbers to unsigned
It reports a error as follows from some tools at two places in our code:
runtime error: left shift of 4 by 29 places cannot be represented in type
'int' So change the type of the two numbers to unsigned to avoid the error.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:24 -04:00
John Garry
c7669f5012 scsi: hisi_sas: Reduce HISI_SAS_SGE_PAGE_CNT in size
Macro HISI_SAS_SGE_PAGE_CNT is defined to SG_CHUNK_SIZE, which is 128.

This means that sizeof(struct hisi_sas_slot_buf_table) is 4192. This is
just over a 4K, which can mean inefficient DMA memory usage (for no PI).

Reduce the size of HISI_SAS_SGE_PAGE_CNT to 124 to fit in a 4K page. With
this change, we experience no performance hit.

Cc: dann frazier <dann.frazier@canonical.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:24 -04:00
Xiaofei Tan
794327ab53 scsi: hisi_sas: Fix the issue of argument mismatch of printing ecc errors
The argument of dev_err() called by multi_bit_ecc_error_process_v3_hw() is
not right. We pass two arguments, but there is only one printk format
specifier in the string.

Also move the print format string to dev_err().

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:24 -04:00
Xiang Chen
6c86e046cf scsi: hisi_sas: Delete PHY timers when rmmod or probe failed
When removing the driver or when probe fails, we need to delete the PHY
timers.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:24 -04:00
Thomas Gleixner
2874c5fd28 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:26:32 -07:00
Thomas Gleixner
ec8f24b7fa treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:46 +02:00
Xiang Chen
01d4e3a2fc scsi: hisi_sas: Some misc tidy-up
Do some minor tidy-up.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Luo Jiaxing
246ea3c0ad scsi: hisi_sas: Don't fail IT nexus reset for Open Reject timeout
Currently we call hisi_sas_softreset_ata_disk() in
hisi_sas_I_T_nexus_reset().

If this fails for open reject reason, there is no reason to fail the IT
nexus reset, so only fail for TMF_RESP_FUNC_FAILED.

Some other strings spilled over multiple lines are reunited.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Luo Jiaxing
a311570027 scsi: hisi_sas: Don't hard reset disk during controller reset
In the function of hisi_sas_init_device(), we added ops->hardreset() to
clear affiliation of STP target port or handle [STP pending] state.

Function hisi_sas_init_device() will be called when a device is found or
during controller reset. At controller reset, we call
hisi_sas_init_device() to re-init the disks, so calling hardreset() is
unnecessary and it also will cause some delay at controller reset.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Xiaofei Tan
3168d4f800 scsi: hisi_sas: Support all RAS events with MSI interrupts
This patch is to switch HW all error handling from PCI AER to MSI interrupt
due to non-standard PCI implementation. All HW errors which were being
reported through PCI AER can be reported through MSI interrupt also.

Do two things to complete the switch:

1. Notify FW to switch to MSI handling through ACPI DSM.

2. Add MSI handling for some hw errors, ECC errors and poison errors (we
   also call some of them AXI reuser error). They were handled only through
   PCI AER before.

For old FW reporting PCI AER events, the PCI AER handler will see that the
driver on longer support AER, and will leave the device in offlined state,
which is safe.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Xiang Chen
adb5b38c19 scsi: hisi_sas: allocate different SAS address for directly attached situation
In commit 8b8d665315 ("scsi: hisi_sas: make SAS address of SATA disks
unique"), we ensured that each SATA disk in the system has a unique SAS
address, even if it is fake. That was for v2 hw.

Add this for v3 hw.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Xiang Chen
18a54b329c scsi: hisi_sas: Adjust the printk format of functions hisi_sas_init_device()
In function hisi_sas_init_device(), the log is as follows when error for
hardreset:

  hisi_sas_v3_hw 0000:74:02.0: SATA disk hardreset fail: 0xffffffed

Actually if hardreset failed, its return value is negative, so change the
print format from %x to %d.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
John Garry
c63b88ccff scsi: hisi_sas: Fix for setting the PHY linkrate when disconnected
In commit efdcad62e7 ("scsi: hisi_sas: Set PHY linkrate when
disconnected"), we use the sas_phy_data.enable flag to track whether the
PHY was enabled or not, so that we know if we should set the PHY negotiated
linkrate at SAS_LINK_RATE_UNKNOWN or SAS_PHY_DISABLED.

However, it is not proper to use sas_phy_data.enable, since it is only set
when libsas attempts to set the PHY disabled/enabled; hence, it may not
even have an initial value.

As a solution to this problem, introduce hisi_sas_phy.enable to track
whether the PHY is enabled or not, so that we can set the negotiated
linkrate properly when the PHY comes down.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Xiang Chen
447f78c0e1 scsi: hisi_sas: Remedy inconsistent PHY down state in software
Currently there are two scenarioes which may cause PHY state of hardware
(which is 0) is inconsistent with the state held in software:

- Unplug SAS wire before get_phys_state when SAS controller reset, then the
  interrupts of phy down are ignored, phy state is 0 before reset, and it
  also gets 0 after reset, so phy down doesn't occur even if unplugged SAS
  wire;

- For v3 hw later version, it will close bus when 2 bit ECC error occurs.
  So if unplug SAS wire at that time, interrupts of phy down also not
  occur. So at last it will cause host reset. It also get phy state 0
  before and after reset, the same issue occurs.

To solve it, use hisi_sas_phy_down() directly in rescan topology function.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:12 -04:00
Xiang Chen
a97fa58680 scsi: hisi_sas: add host reset interface for test
Add host reset interface to make it easier for testing the host reset
feature.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:30:11 -04:00
Luo Jiaxing
0e83fc61ee scsi: hisi_sas: Add softreset in hisi_sas_I_T_nexus_reset()
We found out that for v2 hw, a SATA disk can not be written to after the
system comes up.

In commit ffb1c820b8 ("scsi: hisi_sas: remove the check of sas_dev status
in hisi_sas_I_T_nexus_reset()"), we introduced a path where we may issue an
internal abort for a SATA device, but without following it with a
softreset.

We need to always follow an internal abort with a software reset, as per HW
programming flow, so add this.

Fixes: ffb1c820b8 ("scsi: hisi_sas: remove the check of sas_dev status in hisi_sas_I_T_nexus_reset()")
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-20 14:24:04 -04:00
Linus Torvalds
477558d7e8 SCSI misc on 20190315
This is the final round of mostly small fixes and performance
 improvements to our initial submit.  The main regression fix is the
 ia64 simscsi build failure which was missed in the serial number
 elimination conversion.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXIxBayYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pisherpAP4rxLpX
 bcUnQnEsvoxys/JyoK08Qfv1JebZo1B2MAZ62wD/VZ7LpOuzVLhsM2KhLFGRrs1/
 7D2K4tgtO2dQsFix7H0=
 =pcHl
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
 "This is the final round of mostly small fixes and performance
  improvements to our initial submit.

  The main regression fix is the ia64 simscsi build failure which was
  missed in the serial number elimination conversion"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits)
  scsi: ia64: simscsi: use request tag instead of serial_number
  scsi: aacraid: Fix performance issue on logical drives
  scsi: lpfc: Fix error codes in lpfc_sli4_pci_mem_setup()
  scsi: libiscsi: Hold back_lock when calling iscsi_complete_task
  scsi: hisi_sas: Change SERDES_CFG init value to increase reliability of HiLink
  scsi: hisi_sas: Send HARD RESET to clear the previous affiliation of STP target port
  scsi: hisi_sas: Set PHY linkrate when disconnected
  scsi: hisi_sas: print PHY RX errors count for later revision of v3 hw
  scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO
  scsi: hisi_sas: Change return variable type in phy_up_v3_hw()
  scsi: qla2xxx: check for kstrtol() failure
  scsi: lpfc: fix 32-bit format string warning
  scsi: lpfc: fix unused variable warning
  scsi: target: tcmu: Switch to bitmap_zalloc()
  scsi: libiscsi: fall back to sendmsg for slab pages
  scsi: qla2xxx: avoid printf format warning
  scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset
  scsi: lpfc: Correct __lpfc_sli_issue_iocb_s4 lockdep check
  scsi: ufs: hisi: fix ufs_hba_variant_ops passing
  scsi: qla2xxx: Fix panic in qla_dfs_tgt_counters_show
  ...
2019-03-16 12:51:50 -07:00
Linus Torvalds
92fff53b71 SCSI misc on 20190306
This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc,
 hisi_sas, target/iscsi and target/core.  Additionally Christoph
 refactored gdth as part of the dma changes.  The major mid-layer
 change this time is the removal of bidi commands and with them the
 whole of the osd/exofs driver and filesystem.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXIC54SYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishT1GAPwJEV23
 ExPiPsnuVgKj49nLTagZ3rILRQcYNbL+MNYqxQEA0cT8FHzSDBfWY5OKPNE+RQ8z
 f69LpXGmMpuagKGvvd4=
 =Fhy1
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc,
  hisi_sas, target/iscsi and target/core.

  Additionally Christoph refactored gdth as part of the dma changes. The
  major mid-layer change this time is the removal of bidi commands and
  with them the whole of the osd/exofs driver and filesystem. This is a
  major simplification for block and mq in particular"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (240 commits)
  scsi: cxgb4i: validate tcp sequence number only if chip version <= T5
  scsi: cxgb4i: get pf number from lldi->pf
  scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c
  scsi: mpt3sas: Add missing breaks in switch statements
  scsi: aacraid: Fix missing break in switch statement
  scsi: kill command serial number
  scsi: csiostor: drop serial_number usage
  scsi: mvumi: use request tag instead of serial_number
  scsi: dpt_i2o: remove serial number usage
  scsi: st: osst: Remove negative constant left-shifts
  scsi: ufs-bsg: Allow reading descriptors
  scsi: ufs: Allow reading descriptor via raw upiu
  scsi: ufs-bsg: Change the calling convention for write descriptor
  scsi: ufs: Remove unused device quirks
  Revert "scsi: ufs: disable vccq if it's not needed by UFS device"
  scsi: megaraid_sas: Remove a bunch of set but not used variables
  scsi: clean obsolete return values of eh_timed_out
  scsi: sd: Optimal I/O size should be a multiple of physical block size
  scsi: MAINTAINERS: SCSI initiator and target tweaks
  scsi: fcoe: make use of fip_mode enum complete
  ...
2019-03-09 16:53:47 -08:00
Xiang Chen
cf9efd5d92 scsi: hisi_sas: Change SERDES_CFG init value to increase reliability of HiLink
With default value of register SERDES_CFG, the link is not stable for some
special disks when running IO. According to HW guys' suggestion, need to
make the bit10~19 value of register SERDES_CFG the max value to increase
the reliability of the HiLink.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Yupeng Zhou <zhouyupeng1@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-06 19:26:46 -05:00
Xiang Chen
57dbb2b218 scsi: hisi_sas: Send HARD RESET to clear the previous affiliation of STP target port
If we exchange SAS expander from one SAS controller to other SAS controller
without powering it down, the STP target port will maintain previous
affiliation and reject all subsequent connection requests from other STP
initiator ports with OPEN_REJECT (STP RESOURCES BUSY).

To solve this issue, send HARD RESET to clear the previous affiliation of
STP target port according to SPL (chapter 6.19.4).

We (re-)introduce dev status flag to know if to sleep in NEXUS reset code
or not for remote PHYs. The idea is that if the device is being
initialised, we don't require the delay, and caller would wait for link to
be established, cf. sas_ata_hard_reset().

Co-developed-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-06 19:26:46 -05:00
John Garry
efdcad62e7 scsi: hisi_sas: Set PHY linkrate when disconnected
When the PHY comes down, we currently do not set the negotiated linkrate:

root@(none)$ pwd
/sys/class/sas_phy/phy-0:0
root@(none)$ more enable
1
root@(none)$ more negotiated_linkrate
12.0 Gbit
root@(none)$ echo 0 > enable
root@(none)$ more negotiated_linkrate
12.0 Gbit
root@(none)$

This patch fixes the driver code to set it properly when the PHY comes
down.

If the PHY had been enabled, then set unknown; otherwise, flag as disabled.

The logical place to set the negotiated linkrate for this scenario is PHY
down routine, which is called from the PHY down ISR.

However, it is not possible to know if the PHY comes down due to PHY
disable or loss of link, as sas_phy.enabled member is not set until after
the transport disable routine is complete, which races with the PHY down
ISR.

As an imperfect solution, use sas_phy_data.enable as the flag to know if
the PHY is down due to disable. It's imperfect, as sas_phy_data is internal
to libsas.

I can't see another way without adding a new field to hisi_sas_phy and
managing it, or changing SCSI SAS transport.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-06 19:26:46 -05:00
Xiaofei Tan
aaeb82323d scsi: hisi_sas: print PHY RX errors count for later revision of v3 hw
The later revision of v3 hw has added an function of interrupt coalesce
according to time for PHY RX errors. We set the coalesce time to 1s.  Then
we print PHY RX errors count when PHY RX errors happen, and don't need to
worry that there may be too much log prints.

Besides, we use hisi_sas_phy.lock to protect error count value. Because we
update them by calling phy_get_events_v3_hw(), which is also used by core
driver (for get PHY events function).

We relocate phy_get_events_v3_hw() to avoid a further declaration.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-06 19:26:46 -05:00
Xiang Chen
4790595723 scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO
For internal IO and SMP IO, there is a time-out timer for them. In the
timer handler, it checks whether IO is done according to the flag
task->task_state_lock.

There is an issue which may cause system suspended: internal IO or SMP IO
is sent, but at that time because of hardware exception (such as inject
2Bit ECC error), so IO is not completed and also not timeout. But, at that
time, the SAS controller reset occurs to recover system. It will release
the resource and set the status of IO to be SAS_TASK_STATE_DONE, so when IO
timeout, it will never complete the completion of IO and wait for ever.

[  729.123632] Call trace:
[  729.126791] [<ffff00000808655c>] __switch_to+0x94/0xa8
[  729.133106] [<ffff000008d96e98>] __schedule+0x1e8/0x7fc
[  729.138975] [<ffff000008d974e0>] schedule+0x34/0x8c
[  729.144401] [<ffff000008d9b000>] schedule_timeout+0x1d8/0x3cc
[  729.150690] [<ffff000008d98218>] wait_for_common+0xdc/0x1a0
[  729.157101] [<ffff000008d98304>] wait_for_completion+0x28/0x34
[  729.165973] [<ffff000000dcefb4>] hisi_sas_internal_task_abort+0x2a0/0x424 [hisi_sas_test_main]
[  729.176447] [<ffff000000dd18f4>] hisi_sas_abort_task+0x244/0x2d8 [hisi_sas_test_main]
[  729.185258] [<ffff000008971714>] sas_eh_handle_sas_errors+0x1c8/0x7b8
[  729.192391] [<ffff000008972774>] sas_scsi_recover_host+0x130/0x398
[  729.199237] [<ffff00000894d8a8>] scsi_error_handler+0x148/0x5c0
[  729.206009] [<ffff0000080f4118>] kthread+0x10c/0x138
[  729.211563] [<ffff0000080855dc>] ret_from_fork+0x10/0x18

To solve the issue, callback function task_done of those IOs need to be
called when on SAS controller reset.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-06 19:26:46 -05:00
Xiang Chen
fba770c668 scsi: hisi_sas: Change return variable type in phy_up_v3_hw()
According to the tool fortify, phy_up_v3_hw() returns signed value, while
it should return an unsigned value.

So change variable "res" from int to irq_return_t.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-06 19:26:46 -05:00
Hannes Reinecke
d9a00459ef scsi: hisi_sas: fix calls to dma_set_mask_and_coherent()
The change to use dma_set_mask_and_coherent() incorrectly made a second
call with the 32 bit DMA mask value when the call with the 64 bit DMA
mask value succeeded.

[mkp: fixed commit message]

Fixes: e4db40e7a1 ("scsi: hisi_sas: use dma_set_mask_and_coherent")
Cc: <stable@vger.kernel.org>
Suggested-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-25 21:44:40 -05:00
John Garry
4a8bec88f7 scsi: hisi_sas: Do some more tidy-up
Do some very minor tidy-up, for things like needlessly initing variable and
not leaving whitespace before quote endings.

Originally-from: Xiang Chen <chenxiang66@hisilicon.com>
Originally-from: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
Xiang Chen
4fefe5bbf5 scsi: hisi_sas: Use pci_irq_get_affinity() for v3 hw as experimental
For auto-control irq affinity mode, choose the dq to deliver IO according
to the current CPU.

Then it decreases the performance regression that fio and CQ interrupts are
processed on different node.

For user control irq affinity mode, keep it as before.

To realize it, also need to distinguish the usage of dq lock and sas_dev
lock.

We mark as experimental due to ongoing discussion on managed MSI IRQ
during hotplug:
https://marc.info/?l=linux-scsi&m=154876335707751&w=2

We're almost at the point where we can expose multiple queues to the upper
layer for SCSI MQ, but we need to sort out the per-HBA tags performance
issue.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
John Garry
795f25a31b scsi: hisi_sas: Issue internal abort on all relevant queues
To support queue mapped to a CPU, it needs to be ensured that issuing an
internal abort is safe, in that it is guaranteed that an internal abort is
processed for a single IO or a device after all the relevant command(s)
which it is attempting to abort have been processed by the controller.

Currently we only deliver commands for any device on a single queue to
solve this problem, as we know that commands issued on the same queue will
be processed in order, and we will not have a scenario where the internal
abort is racing against a command(s) which it is trying to abort.

To enqueue commands on queue mapped to a CPU, choosing a queue for an
command is based on the associated queue for the current CPU, so this is
not safe for internal abort since it would definitely not be guaranteed
that commands for the command devices are issued on the same queue.

To solve this issue, we take a bludgeoning approach, and issue a separate
internal abort on any queue(s) relevant to the command or device, in that
we will be guaranteed that at least one of these internal aborts will be
received last in the controller.

So, for aborting a single command, we can just force the internal abort to
be issued on the same queue as the command which we are trying to abort.

For aborting all commands associated with a device, we issue a separate
internal abort on all relevant queues. Issuing multiple internal aborts in
this fashion would have not side affect.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
Xiang Chen
1273d65f29 scsi: hisi_sas: change queue depth from 512 to 4096
If sending IOs to many disks from single queue, it is possible that the
queue may be full. To avoid the situation, change queue depth from 512 to
4096 which is the max number of IOs for v3 hw.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
Luo Jiaxing
7c5e136363 scsi: hisi_sas: Add manual trigger for debugfs dump
Add an interface to manually trigger a debugfs dump.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
Xiang Chen
b3cce125cb scsi: hisi_sas: Add support for DIX feature for v3 hw
This patch adds support for DIX to v3 hw driver.

For this, we build upon support for DIF, most significantly is adding new
DMA map and unmap paths.

Some pre-existing macro precedence issues are also tidied. They were
detected by checkpatch --strict.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:21 -05:00
John Garry
ede2afb9c8 scsi: hisi_sas: Add missing seq_printf() call in hisi_sas_show_row_32()
This call must have been missed when I reworked the debugfs feature for
upstreaming, so add it back.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:20 -05:00
John Garry
e1ba0b0b44 scsi: hisi_sas: Fix to only call scsi_get_prot_op() for non-NULL scsi_cmnd
A NULL-pointer dereference was introduced for TMF SSP commands from the
upstreaming reworking.

Fix this by relocating the scsi_get_prot_op() callsite.

Fixes: d6a9000b81 ("scsi: hisi_sas: Add support for DIF feature for v2 hw")
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:20 -05:00
John Garry
26889e5ec8 scsi: hisi_sas: Some misc tidy-up
Sparse detected some problems in the driver, so tidy them up.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:20 -05:00
Luo Jiaxing
d1548e9c32 scsi: hisi_sas: Correct memory allocation size for DQ debugfs
Some sizes we allocate for debugfs structure are incorrect, so fix them.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:20 -05:00
Xiaofei Tan
b6c9b15e44 scsi: hisi_sas: Fix losing directly attached disk when hot-plug
Hot-plugging SAS wire of direct hard disk backplane may cause disk lost. We
have done this test with several types of SATA disk from different venders,
and only two models from Seagate has this problem, ST4000NM0035-1V4107 and
ST3000VM002-1ET166.

The root cause is that the disk doesn't send D2H frame after OOB finished.
SAS controller will issue phyup interrupt only when D2H frame is received,
otherwise, will be waiting there all the time.

When this issue happen, we can find the disk again with link reset.  To fix
this issue, we setup an timer after OOB finished. If the PHY is not up in
20s, do link reset. Notes: the 20s is an experience value.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:20 -05:00
Luo Jiaxing
eb44e4d7b5 scsi: hisi_sas: Reject setting programmed minimum linkrate > 1.5G
The SAS controller cannot support a programmed minimum linkrate of > 1.5G
(it will always negotiate to 1.5G at least), so just reject it.

This solves a strange situation where the PHY negotiated linkrate may be
less than the programmed minimum linkrate.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:20 -05:00
Xiang Chen
ae68b566e0 scsi: hisi_sas: Remove unused parameter of function hisi_sas_alloc()
In function hisi_sas_alloc(), parameter shost is not used, so remove it.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:20 -05:00
Xiang Chen
ffb1c820b8 scsi: hisi_sas: remove the check of sas_dev status in hisi_sas_I_T_nexus_reset()
When issing a hardreset to a SATA device when running IO, it is possible
that abnormal CQs of the device are returned. Then enter error handler, it
doesn't enter function hisi_sas_abort_task() as there is no timeout IO, and
it doesn't set device as HISI_SAS_DEV_EH. So when hardreset by libata
later, it actually doesn't issue hardreset as there is a check to judge
whether device is in error.

For this situation, actually need to hardreset the device to recover.
So remove the check of sas_dev status in hisi_sas_I_T_nexus_reset().

Before we add the check to avoid the endless loop of reset for
directly-attached SATA device at probe time, actually we flutter it for
it, so it is not necessary to add the check now.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:20 -05:00
Xiang Chen
5c31b0c677 scsi: hisi_sas: shutdown axi bus to avoid exception CQ returned
When injecting 2 bit ECC error, it will cause fatal AXI interrupts. Before
the recovery of SAS controller reset, the internal of SAS controller is in
error. If CQ interrupts return at the time, actually it is exception CQ
interrupt, and it may cause resource release in disorder.

To avoid the exception situation, shutdown AXI bus after fatal AXI
interrupt. In SAS controller reset, it will restart AXI bus. For later
version of v3 hw, hardware will shutdown AXI bus for this situation, so
just fix current ver of v3 hw.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:20 -05:00
Xiang Chen
569eddcf3a scsi: hisi_sas: send primitive NOTIFY to SSP situation only
Send primitive NOTIFY to SSP situation only, or it causes underflow issue
when sending IO. Also rename hisi_sas_hw.sl_notify() to hisi_sas_hw.
sl_notify_ssp().

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:20 -05:00
Luo Jiaxing
5979f33b98 scsi: hisi_sas: Add debugfs ITCT file and add file operations
This patch creates debugfs file for ITCT and adds file operations.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:19 -05:00
John Garry
5b0eeac4be scsi: hisi_sas: Fix type casting and missing static qualifier in debugfs code
Sparse can detect some type casting issues in the debugfs code, so fix it
up.

Also a missing static qualifier is added to hisi_sas_debugfs_to_reg_name().

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 01:41:19 -05:00