Commit Graph

418 Commits

Author SHA1 Message Date
Andreas Gruenbacher
5476169793 drbd: Rename struct drbd_conf -> struct drbd_device
sed -i -e 's:\<drbd_conf\>:drbd_device:g'

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17 16:36:44 +01:00
Andreas Gruenbacher
a3603a6e3b drbd: Split off on-the-wire protocol definitions
Keep the protocol definitions separate from the kernel code; they are useful in
their own right.

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17 16:27:49 +01:00
Philipp Reisner
8e22943430 drbd: Add missing error goto
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2014-02-17 16:24:47 +01:00
Rashika Kheria
4b7a530f6b drivers: block: Mark functions as static in drbd_nl.c
Mark functions conn_khelper(), nla_put_drbd_cfg_context(),
nla_put_status_info() and get_one_status() as static in drbd/drbd_nl.c
because they are not used outside this file.

This eliminates the following warnings in drbd/drbd_nl.c:
drivers/block/drbd/drbd_nl.c:365:5: warning: no previous prototype for ‘conn_khelper’ [-Wmissing-prototypes]
drivers/block/drbd/drbd_nl.c:2727:5: warning: no previous prototype for ‘nla_put_drbd_cfg_context’ [-Wmissing-prototypes]
drivers/block/drbd/drbd_nl.c:2753:5: warning: no previous prototype for ‘nla_put_status_info’ [-Wmissing-prototypes]
drivers/block/drbd/drbd_nl.c:2895:5: warning: no previous prototype for ‘get_one_status’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2014-02-17 16:19:38 +01:00
Lars Ellenberg
35f47ef1a1 drbd: avoid to shrink max_bio_size due to peer re-configuration
For a long time, the receiving side has spread "too large" incoming
requests over multiple bios.  No need to shrink our max_bio_size
(max_hw_sectors) if the peer is reconfigured to use a different storage.

The problem manifests itself if we are not the top of the device stack
(DRBD is used a LVM PV).

A hardware reconfiguration on the peer may cause the supported
max_bio_size to shrink, and the connection handshake would now
unnecessarily shrink the max_bio_size on the active node.

There is no way to notify upper layers that they have to "re-stack"
their limits. So they won't notice at all, and may keep submitting bios
that are suddenly considered "too large for device".

We already check for compatibility and ignore changes on the peer,
the code only was masked out unless we have a fully established connection.
We just need to allow it a bit earlier during the handshake.

Also consider max_hw_sectors in our merge bvec function, just in case.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:29 -07:00
Philipp Reisner
57737adc96 drbd: Fix adding of new minors with freshly created meta data
Online adding of new minors with freshly created meta data
to an resource with an established connection failed, with a
wrong state transition on one side on one side of the new minor.

Freshly created meta-data has a la_size (last agreed size) of 0.
When we online add such devices, the code wrongly got into
the code path for resyncing new storage that was added while
the disk was detached.

Fixed that by making the GREW from ZERO a special case.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-11-08 09:10:28 -07:00
Philipp Reisner
d752b26960 drbd: Allow online change of al-stripes and al-stripe-size
Allow to change the AL layout with an resize operation. For that
the reisze command gets two new fields: al_stripes and al_stripe_size.

In order to make the operation crash save:
1) Lock out all IO and MD-IO
2) Write the super block with MDF_PRIMARY_IND clear
3) write the bitmap to the new location (all zeros, since
   we allow only while connected)
4) Initialize the new AL-area
5) Write the super block with the restored MDF_PRIMARY_IND.
6) Unfreeze all IO

Since the AL-layout has no influence on the protocol, this operation
needs to be beforemed on both sides of a resource (if intended).

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-06-28 16:04:36 +02:00
Philipp Reisner
e96c96333f drbd: Constants should be UPPERCASE
Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-06-28 16:04:36 +02:00
Philipp Reisner
28e448bb30 drbd: Ignore the exit code of a fence-peer handler if it returns too late
In case the connection was established and lost again before
the a fence-peer handler returns, ignore the exit code of this
instance. (And use the exit code of the later started instance)

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-06-28 16:04:36 +02:00
Andreas Gruenbacher
f9eb7bf424 drbd: Fix rcu_read_lock balance on error path
Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-06-28 16:04:36 +02:00
Lars Ellenberg
a3f8f7dc7a drbd: validate resync_after dependency on attach already
We validated resync_after dependencies, if changed via disk-options.
But we did not validate them when first created via attach.
We also did not check or cleanup dependencies that used to be correct,
but now point to meanwhile removed minor devices.

If the drbd_resync_after_valid() validation in disk-options tried to
follow a dependency chain in this way, this could lead to NULL pointer
dereference.

Validate resync_after settings in drbd_adm_attach() already, as well as
in drbd_adm_disk_opts(), and and only reject dependency loops.
Depending on non-existing disks is allowed and equivalent to no dependency.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28 10:10:25 -06:00
Philipp Reisner
2bd5ed5d67 drbd: Fix disconnect to keep the peer disk state if connection breaks during operation
The issue was that if the connection broke while we did the
gracefull state change to C_DISCONNECTING (C_TEARDOWN), then
we returned a success code from the state engine. (SS_CW_NO_NEED)

The result of that is that we missed to call the fence-peer
script in such a case.

Fixed that by introducing a new error code (SS_OUTDATE_WO_CONN).
This one should never reach back into user space.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28 10:10:25 -06:00
Philipp Reisner
0b2dafcd9f drbd: drop now useless duplicate state request from invalidate
Patch best viewed with git diff --ignore-space-change.

Now that we attempt the fallback to local bitmap operation
only when disconnected, we can safely drop the extra "silent"
state request from both invalidate and invalidate-remote.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28 10:10:24 -06:00
Philipp Reisner
9376d9f8b9 drbd: move invalidating the whole bitmap out of after_state ch()
To avoid other state change requests, after passing through
sanitize_state(), to be mistaken for an invalidate,
move the "set all bits as out-of-sync" into the invalidate path.

Make invalidate and invalidate-remote behave consistently wrt.
current connection state (need either an established replication link,
or really be disconnected). Also mention that in the documentation.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28 10:10:24 -06:00
Lars Ellenberg
5bbcf5e6ab drbd: adjust upper limit for activity log extents
Now that the on-disk activity-log ring buffer size is adjustable,
the maximum active set can become larger, and is now limited by
the use of 16bit "labels".

This increases the maximum working set from 6433 to 65534 extents,
each of which covers an area of 4MiB.
Which means that if you use the maximum, you'd have to resync
more than 250 GiB after an unclean Primary shutdown.
With capable backend storage and replication links,
this is entirely feasible.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 22:18:09 -06:00
Lars Ellenberg
113fef9e20 drbd: prepare to queue write requests on a submit worker
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 18:14:40 -06:00
Lars Ellenberg
c04ccaa669 drbd: read meta data early, base on-disk offsets on super block
We used to calculate all on-disk meta data offsets, and then compare
the stored offsets, basically treating them as magic numbers.

Now with the activity log striping, the activity log size is no longer
fixed.  We need to first read the super block, then base the activity
log and bitmap offsets on the stored offsets/al stripe settings.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 18:13:59 -06:00
Lars Ellenberg
cccac9857d drbd: mechanically rename la_size to la_size_sect
Make it obvious that this value is in units of 512 Byte sectors.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 18:13:59 -06:00
Lars Ellenberg
68e41a43f1 drbd: use the cached meta_dev_idx
Now we have the cached meta_dev_idx member,
we can get rid of a few rcu_read_lock() sections and rcu_dereference().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 18:13:59 -06:00
Lars Ellenberg
3a4d4eb3cb drbd: prepare for new striped layout of activity log
Introduce two new on-disk meta data fields: al_stripes and al_stripe_size_4k
The intended use case is activity log on RAID 0 or similar.
Logically consecutive transactions will advance their on-disk position
by al_stripe_size_4k 4kB (transaction sized) blocks.

Right now, these are still asserted to be the backward compatible
values al_stripes = 1, al_stripe_size_4k = 8 (which amounts to 32kB).

Also introduce a caching member for meta_dev_idx in the in-core
structure: even though it is initially passed in in the rcu-protected
disk_conf structure, it cannot change without a detach/attach cycle.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 18:13:59 -06:00
Lars Ellenberg
ae8bf312e9 drbd: cleanup ondisk meta data layout calculations and defines
Add a comment about our meta data layout variants,
and rename a few defines (e.g. MD_RESERVED_SECT -> MD_128MB_SECT)
to make it clear that they are short hand for fixed constants,
and not arbitrarily to be redefined as one may see fit.

Properly pad struct meta_data_on_disk to 4kB,
and initialize to zero not only the first 512 Byte,
but all of it in drbd_md_sync().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22 18:13:59 -06:00
Philipp Reisner
ef86b77957 drbd: Fix drbdsetup wait-connect, wait-sync etc... commands
This was introduces when moving the code over from the 8.3 codebase
with commit 328e0f125b

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-12-06 13:04:34 +01:00
Lars Ellenberg
691631c065 drbd: respect no-md-barriers setting also when changed online via disk-options
We need to propagate the configuration into the flag bits,
or it won't be effective.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-12-06 13:00:04 +01:00
Philipp Reisner
986836503e Merge branch 'drbd-8.4_ed6' into for-3.8-drivers-drbd-8.4_ed6 2012-11-09 14:20:23 +01:00
Philipp Reisner
328e0f125b drbd: Broadcast sync progress no more often than once per second
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:11:43 +01:00
Philipp Reisner
4035e4c2eb drbd: Fix clearing of MDF_AL_DISABLED
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:11:42 +01:00
Lars Ellenberg
edc9f5eb7a drbd: always write bitmap on detach
If we detach due to local read-error (which sets a bit in the bitmap),
stay Primary, and then re-attach (which re-reads the bitmap from disk),
we potentially lost the "out-of-sync" (or, "bad block") information in
the bitmap.

Always (try to) write out the changed bitmap pages before going diskless.

That way, we don't lose the bit for the bad block,
the next resync will fetch it from the peer, and rewrite
it locally, which may result in block reallocation in some
lower layer (or the hardware), and thereby "heal" the bad blocks.

If the bitmap writeout errors out as well, we will (again: try to)
mark the "we need a full sync" bit in our super block,
if it was a READ error; writes are covered by the activity log already.

If that superblock does not make it to disk either, we are sorry.

Maybe we just lost an entire disk or controller (or iSCSI connection),
and there actually are no bad blocks at all, so we don't need to
re-fetch from the peer, there is no "auto-healing" necessary.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:11:41 +01:00
Philipp Reisner
19fffd7b03 drbd: Call drbd_md_sync() explicitly after a state change on the connection
Without this, the meta-data gets updates after 5 seconds by the
md_sync_timer. Better to do it immeditaly after a state change.

If the asender detects a network failure, it may take a bit until
the worker processes the according after-conn-state-change work item.

  The worker might be blocked in sending something, i.e. it
  takes until it gets into its timeout. That is 6 seconds by
  default which is longer than the 5 seconds of the md_sync_timer.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:11:08 +01:00
Lars Ellenberg
0ee98e2eb0 drbd: temporarily suspend io in drbd_adm_disk_opts
drbd_adm_disk_opts() does
	wait_event(mdev->al_wait, lc_try_lock(mdev->act_log));
	drbd_al_shrink(mdev);

If the device is very busy, this can take a very long time to succeed.
Fix this by temporarily suspending IO,
then quickly change the settings, and resume.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:20 +01:00
Philipp Reisner
39a1aa7f49 drbd: Protect accesses to the uuid set with a spinlock
There is at least the worker context, the receiver context, the context of
receiving netlink packts.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:08:04 +01:00
Philipp Reisner
fef45d297e drbd: Write all pages of the bitmap after an online resize
We need to write the whole bitmap after we moved the meta data
due to an online resize operation.

With the support for one peta byte devices bitmap IO was optimized
to only write out touched pages. This optimization must be turned
off when writing the bitmap after an online resize.

This issue was introduced with drbd-8.3.10.

The impact of this bug is that after an online resize, the next
resync could become larger than expected.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:51 +01:00
Lars Ellenberg
eb12010e9a drbd: disambiguation, s/ERR_DISCARD/ERR_DISCARD_IMPOSSIBLE/
If for some reason (typically "split-brained" cluster manager)
drbd replica data has diverged, we can chose a victim,
and reconnect using "--discard-my-data", causing the victim
to become sync-target, fetching all changed blocks from the peer.

If we are Primary, we are potentially in use, and we refuse to
"roll back" changes to the data below the page cache and other users.

Rename the error symbol for this to ERR_DISCARD_IMPOSSIBLE.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:50 +01:00
Lars Ellenberg
427c0434fc drbd: disambiguation, s/DISCARD_CONCURRENT/RESOLVE_CONFLICTS/
We don't discard anything here, really.
We resolve conflicting, concurrent writes to overlapping data blocks.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:49 +01:00
Philipp Marek
3174f8c504 drbd: pass some more information to userspace.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:45 +01:00
Lars Ellenberg
58ffa580a7 drbd: introduce stop-sector to online verify
We now can schedule only a specific range of sectors for online verify,
or interrupt a running verify without interrupting the connection.

Had to bump the protocol version differently, we are now 101.
Added verify_can_do_stop_sector() { protocol >= 97 && protocol != 100; }

Also, the return value convention for worker callbacks has changed,
we returned "true/false" for "keep the connection up" in 8.3,
we return 0 for success and <= for failure in 8.4.
Affected: receive_state()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-09 14:05:32 +01:00
Lars Ellenberg
970fbde1f1 drbd: flush drbd work queue before invalidate/invalidate remote
If you do back to back wait-sync/invalidate on a Primary in a tight loop,
during application IO load, you could trigger a race:
  kernel: block drbd6: FIXME going to queue 'set_n_write from StartingSync'
    but 'write from resync_finished' still pending?

Fix this by changing the order of the drbd_queue_work() and
the wake_up() in dec_ap_pending(), and adding the additional
drbd_flush_workqueue() before requesting the full sync.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:41 +01:00
Lars Ellenberg
a324896b17 drbd: do not reset rs_pending_cnt too early
Fix asserts like
  block drbd0: in got_BlockAck:4634: rs_pending_cnt = -35 < 0 !

We reset the resync lru cache and related information (rs_pending_cnt),
once we successfully finished a resync or online verify, or if the
replication connection is lost.

We also need to reset it if a resync or online verify is aborted
because a lower level disk failed.

In that case the replication link is still established,
and we may still have packets queued in the network buffers
which want to touch rs_pending_cnt.

We do not have any synchronization mechanism to know for sure when all
such pending resync related packets have been drained.

To avoid this counter to go negative (and violate the ASSERT that it
will always be >= 0), just do not reset it when we lose a disk.

It is good enough to make sure it is re-initialized before the next
resync can start: reset it when we re-attach a disk.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:40 +01:00
Lars Ellenberg
6f3465ed82 drbd: report congestion if we are waiting for some userland callback
If the drbd worker thread is synchronously waiting for some userland
callback, we don't want some casual pageout to block on us.
Have drbd_congested() report congestion in that case.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:39 +01:00
Lars Ellenberg
0c84966601 drbd: differentiate between normal and forced detach
Aborting local requests (not waiting for completion from the lower level
disk) is dangerous: if the master bio has been completed to upper
layers, data pages may be re-used for other things already.
If local IO is still pending and later completes,
this may cause crashes or corrupt unrelated data.

Only abort local IO if explicitly requested.
Intended use case is a lower level device that turned into a tarpit,
not completing io requests, not even doing error completion.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:39 +01:00
Lars Ellenberg
27012382bc drbd: take error path in drbd_adm_down if interrupted by signal
drbd_adm_down() does adm_detach(), which can fail with various error
codes, or be interrupted by a signal.

The interrupted by signal case was not properly handled,
leading to
	block drbd0: ASSERT( mdev->state.disk == D_DISKLESS &&
	                     mdev->state.conn == C_STANDALONE ) in drbd/drbd_worker.c
and further to destroying objects while still in use, and resulting crashes.

Detect the interruption, and take the error path out.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:37 +01:00
Lars Ellenberg
b6dd1a8976 drbd: remove struct drbd_tl_epoch objects (barrier works)
cherry-picked and adapted from drbd 9 devel branch

DRBD requests (struct drbd_request) are already on the per resource
transfer log list, and carry their epoch number. We do not need to
additionally link them on other ring lists in other structs.

The drbd sender thread can recognize itself when to send a P_BARRIER,
by tracking the currently processed epoch, and how many writes
have been processed for that epoch.

If the epoch of the request to be processed does not match the currently
processed epoch, any writes have been processed in it, a P_BARRIER for
this last processed epoch is send out first.
The new epoch then becomes the currently processed epoch.

To not get stuck in drbd_al_begin_io() waiting for P_BARRIER_ACK,
the sender thread also needs to handle the case when the current
epoch was closed already, but no new requests are queued yet,
and send out P_BARRIER as soon as possible.

This is done by comparing the per resource "current transfer log epoch"
(tconn->current_tle_nr) with the per connection "currently processed
epoch number" (tconn->send.current_epoch_nr), while waiting for
new requests to be processed in wait_for_work().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:35 +01:00
Philipp Reisner
9a51ab1c1b drbd: New disk option al-updates
By disabling al-updates one might increase performace. The price for
that is that in case a crashed primary (that had al-updates disabled)
is reintegraded, it will receive a full-resync instead of a bitmap
based resync.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:31 +01:00
Andreas Gruenbacher
26ec92871b drbd: Stop using NLA_PUT*().
These macros no longer exist in kernel version v3.5-rc1.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:30 +01:00
Lars Ellenberg
5016b82a49 drbd: fix race between drbdadm invalidate/verify and finishing resync
When a resync or online verify is finished or aborted,
drbd does a bulk write-out of changed bitmap pages.

If *in that very moment* a new verify or resync is triggered,
this can race:
 ASSERT( !test_bit(BITMAP_IO, &mdev->flags) ) in drbd_main.c
 FIXME going to queue 'set_n_write from StartingSync' but 'write from resync_finished' still pending?
and similar.

This can be observed with e.g. tight invalidate loops in test scripts,
and probably has no real-life implication.

Still, that race can be solved by first quiescen the device,
before starting a new resync or verify.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:27 +01:00
Philipp Reisner
a1096a6e9d drbd: Delay/reject other state changes while establishing a connection
Changes to the role and disk state should be delayed or rejected
while we establish a connection.

This is necessary, since the peer will base its resync decision
on the UUIDs and the state we sent in the drbd_connect() function.

The most prominent example for this race is becoming primary after
sending state and UUIDs and before the state changes to C_WF_CONNECTION.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:26 +01:00
Philipp Reisner
27eb13e99b drbd: Fixed processing of disk-barrier, disk-flushes and disk-drain
Since drbd_bump_write_ordering() is called in the attaching
process while the disk state is D_ATTACHING, it was not
considering these three flags during attach.

A call to this function was missing form drbd_adm_disk_opts().

Fixed both issues.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:25 +01:00
Philipp Reisner
25b0d6c8c1 drbd: Reinstate disabling AL updates with invalidate-remote
Commit d0ef827e (drbd: switch configuration interface from connector to
genetlink) introduced a regression by removing the ability to set all
bits in the out of sync bitmap and to suspend updates to the activity log
of a disconnected device via the invalidate-remote management call.

Credits for reporting the issue are going to Arne Redlich.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:15 +01:00
Philipp Reisner
4b0007c0e8 drbd: Move write_ordering from mdev to tconn
This is necessary in order to prepare the move of the (receiver side)
epoch list from the device (mdev) to the connection (tconn) objects.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:07 +01:00
Philipp Reisner
43de7c852b drbd: Fixes from the drbd-8.3 branch
* drbd-8.3:
  drbd: O_SYNC gives EIO on ramdisks for some kernels (eg. RHEL6).
  drbd: send intermediate state change results to the peer

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:06 +01:00
Philipp Reisner
0cfac5dd90 drbd: Fixes from the drbd-8.3 branch
* drbd-8.3:
  drbd: fix spurious meta data IO "error"
  drbd: Fixed a race condition between detach and start of resync
  drbd: fix harmless race to not trigger an ASSERT
  drbd: Derive sync-UUIDs only from the bitmap-uuid if it is non-zero
  drbd: Fixed current UUID generation (regression introduced recently, after 8.3.11)

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:05 +01:00
Philipp Reisner
9bcd252182 drbd: fix "stalled" empty resync
With sync-after dependencies, given "lucky" timing of pause/unpause
events, and the end of an empty (0 bits set) resync was sometimes not
detected on the SyncTarget, leading to a "stalled" SyncSource state.

Fixed this by expecting not only "Inconsistent -> UpToDate" but also
"Consistent -> UpToDate" transitions for the peer disk state
to end a resync.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:04 +01:00
Lars Ellenberg
25e409321a drbd: fix connect failure with all default net-options
If no net-options are configured (all on their default),
no DRBD_NLA_NET_CONF will be passed to the kernel.
The kernel must not require its presence,
there is no required option in there.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:03 +01:00
Andreas Gruenbacher
a209b4aec3 drbd: Update some outdated comments to match the code
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:02 +01:00
Philipp Reisner
c4e7afdc01 drbd: Remove unused code
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:58:02 +01:00
Andreas Gruenbacher
7d4c782cbd drbd: Fix the data-integrity-alg setting
The last data-integrity-alg fix made data integrity checking work when the
algorithm was changed for an established connection, but the common case of
configuring the algorithm before connecting was still broken.  Fix that.

Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:59 +01:00
Andreas Gruenbacher
f2257a56ee drbd: Allow to create devices with a minor number > minor_count
The minor_count module/kernel parameter serves to scale the size of drbd's
internal memory pool, but it is no longer a limit for the number of minors or
the minor number.  (Minor numbers can be arbitrarily high within the allowed
limit of 2^20.)

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:54 +01:00
Lars Ellenberg
367d675da8 drbd: report net config even for resources without a single volume
Currently it is legal (though unusual) to create and connect a resource,
before adding in all necessary volumes. We should include the network
configuration details, even if we don't have a single volume (yet).

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:53 +01:00
Philipp Reisner
e0e1665381 drbd: Correctly handle resources without volumes
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:52 +01:00
Philipp Reisner
369bea6371 drbd: Fixed removal of volumes/devices from connected resources
When removing a volume/device we need to switch the connection
status of the peer back into WFReportParams.

  Before this fix it was left in Connected state. That means that
  the peer device continued to inform us about state changes, etc...
  But we deleted that minor -> protocol error.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:51 +01:00
Lars Ellenberg
d5d7ebd422 drbd: on attach, enforce clean meta data
Detection of unclean shutdown has moved into user space.

The kernel code will, whenever it updates the meta data, mark it as
"unclean", and will refuse to attach to such unclean meta data.

"drbdadm up" now schedules "drbdmeta apply-al", which will apply
the activity log to the bitmap, and/or reinitialize it, if necessary,
as well as set a "clean" indicator flag.

This moves a bit code out of kernel space.
As a side effect, it also prevents some 8.3 module from accidentally
ignoring the 8.4 style activity log, if someone should downgrade,
whether on purpose, or accidentally because he changed kernel versions
without providing an 8.4 for the new kernel, and the new kernel comes
with in-tree 8.3.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:51 +01:00
Philipp Reisner
cdfda633d2 drbd: detach from frozen backing device
* drbd-8.3:
  documentation: Documented detach's --force and disk's --disk-timeout
  drbd: Implemented the disk-timeout option
  drbd: Force flag for the detach operation
  drbd: Allow new IOs while the local disk in in FAILED state
  drbd: Bitmap IO functions can not return prematurely if the disk breaks
  drbd: Added a kref to bm_aio_ctx
  drbd: Hold a reference to ldev while doing meta-data IO
  drbd: Keep a reference to the bio until the completion handler finished
  drbd: Implemented wait_until_done_or_disk_failure()
  drbd: Replaced md_io_mutex by an atomic: md_io_in_use
  drbd: moved md_io into mdev
  drbd: Immediately allow completion of IOs, that wait for IO completions on a failed disk
  drbd: Keep a reference to barrier acked requests

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:50 +01:00
Philipp Reisner
9510b2411d drbd: Fixed state transitions in case reading meta data failes
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:50 +01:00
Philipp Reisner
2ffca4f3ee drbd: Improve compatibility with drbd's older than 8.3.7
Regression introduced with 8.3.11 commit:
drbd: Take a more conservative approach when deciding max_bio_size

Never ever tell an older drbd, that we support more than 32KiB
in a single data request (packet).
Never believe an older drbd, that is supports more than 32KiB
in a single data request (packet)

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:49 +01:00
Andreas Gruenbacher
d0fa7fd680 drbd: Remove dead code
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:47 +01:00
Andreas Gruenbacher
afbbfa88bc drbd: Allow to pass resource options to the new-resource command
This is equivalent to how the attach and connect commands work.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:46 +01:00
Andreas Gruenbacher
089c075d88 drbd: Convert the generic netlink interface to accept connection endpoints
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:46 +01:00
Andreas Gruenbacher
44e52cfaa2 drbd: Rename DRBD_ADM_NEED_{CONN -> RESOURCE}
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:46 +01:00
Andreas Gruenbacher
01b39b50d3 drbd: Split off netlink mandatory attribute handling into separate file
Duplicate this file in the kernel module and in user space; both sides need it.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:45 +01:00
Andreas Gruenbacher
7c3063cc6f drbd: Also need to check for DRBD_GENLA_F_MANDATORY flags before nla_find_nested()
This is done by introducing drbd_nla_find_nested() which handles the flag
before calling nla_find_nested().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:45 +01:00
Andreas Gruenbacher
789c1b626c drbd: Use the terminology suggested by the command names in the source code and messages
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:44 +01:00
Lars Ellenberg
67b58bf723 drbd: spelling fix: too small
It is not "to small", but "too small".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:44 +01:00
Andreas Gruenbacher
c75b9b10e7 drbd: Don't use empty nested netlink attributes
Before mainline commit ea5693cc (v2.6.29-rc1), empty nested netlink attributes
were not allowed.  Fix that by leaving out nested attributes if they are empty
and by allowing the top-level attributes to be missing.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:57:40 +01:00
Andreas Gruenbacher
1e2a2551ee drbd: drbd_adm_prepare(): Pass through error codes
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:56 +01:00
Philipp Reisner
d659f2aaea drbd: Send PROTOCOL_UPDATE packets when appropriate
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:54 +01:00
Philipp Reisner
dcb20d1a8e drbd: Refuse to change network options online when...
* the peer does not speak protocol_version 100 and the
  user wants to change one of:
    - wire_protocol
    - two_primaries
    - integrity_alg

* the user wants to remove the allow_two_primaries flag
  when there are two primaries

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:52 +01:00
Andreas Gruenbacher
95f8efd08b drbd: Fix the upper limit of resync-after
The 32-bit resync_after netlink field takes a device minor number as
parameter, which is no longer limited to 255.  We cannot statically
verify which device numbers are valid, so set the ummer limit to the
highest possible signed 32-bit integer.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:51 +01:00
Andreas Gruenbacher
6139f60dc1 drbd: Rename the want_lose field/flag to discard_my_data
This is what it is called in config files and on the command line as
well.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:49 +01:00
Andreas Gruenbacher
6f9b5f84f5 drbd: Make broadcast events return NO_ERROR
Instead of returning a ret_code outside of the range of enum
drbd_ret_code, use NO_ERROR to indicate success.  This way,
ret_code has the same meaning in all packets.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:48 +01:00
Philipp Reisner
c141ebda03 drbd: Removing drbd_cfg_rwsem
* Updates to all configuration items is done under genl_lock().
   Including removal of mdevs or tconns.
 * All read non sleeping read sides are protected by rcu
 * All sleeping read sides keep reference counts to keep the
   objects alive

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:48 +01:00
Philipp Reisner
ec0bddbc55 drbd: Use RCU for the drbd_tconns list
Preparing removal of drbd_cfg_rwsem

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:47 +01:00
Philipp Reisner
81fa2e675c drbd: Refcounting for mdev objects
Preparing removal of drbd_cfg_rwsem

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:47 +01:00
Andreas Gruenbacher
e544046ab8 drbd: Turn no-md-flushes into md-flushes={yes|no}
Change the --no-md-flushes drbdsetup command line option as well as
the no_md_flush netlink packet.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:46 +01:00
Philipp Reisner
813472ced7 drbd: RCU for rs_plan_s
This removes the issue with using peer_seq_lock out of different
contexts.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:44 +01:00
Philipp Reisner
d589a21e5d drbd: Enforce limits of disk_conf members; centralized these checks
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:44 +01:00
Philipp Reisner
9958c857c7 drbd: Made the fifo object a self contained object (preparing for RCU)
* Moved rs_planed into it, named total
* When having a pointer to the object the values can
  be embedded into the fifo object.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:43 +01:00
Philipp Reisner
daeda1cca9 drbd: RCU for disk_conf
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:43 +01:00
Lars Ellenberg
563e4cf25e drbd: Introduce __s32_field in the genetlink macro magic
...and drop explicit typecasts (int)meta_dev_idx < 0.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:43 +01:00
Philipp Reisner
dc97b70801 drbd: Split drbd_alter_sa() into drbd_sync_after_valid() and drbd_sync_after_changed()
Preparing RCU for disk_conf

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:42 +01:00
Philipp Reisner
ef5e44a672 drbd: drbd_dew_dev_size() gets the user requests disk_size as argument
Preparing RCU for disk_conf

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:41 +01:00
Philipp Reisner
a0095508ca drbd: Renamed the net_conf_update mutex to conf_update
Preparing to use the same mutex for disk_conf updates

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:41 +01:00
Philipp Reisner
934e6138b5 drbd: Removed dead code
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:40 +01:00
Andreas Gruenbacher
b966b5dd8e drbd: Generate the drbd_set_*_defaults() functions from drbd_genl.h
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:55:38 +01:00
Lars Ellenberg
009ba89db5 drbd: fix schedule in atomic
An administrative detach used to request a state change directly to D_DISKLESS,
first suspending IO to avoid the last put_ldev() occuring from an endio handler,
potentially in irq context.

This is not enough on the receiving side (typically secondary), we may miss
some peer_req on the way to local disk, which then may do the last put_ldev()
from their drbd_peer_request_endio().

This patch makes the detach always go through the intermediate D_FAILED state.
We may consider to rename it D_DETACHING.

Alternative approach would be to create yet an other work item to be scheduled
on the worker, do the destructor work from there, and get the timing right.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:53:01 +01:00
Lars Ellenberg
992d6e91d3 drbd: fix thread stop deadlock
There are races where the receiver may be exiting,
but still need the worker to process some stuff.

Do not wait for the receiver to die from an exiting worker.
The receiver must already be dead in case the worker decides to exit.
If the receiver was still alive, it may still want to queue work, and do
drbd_flush_workqueue() from it's disconnect cleanup code,
which would no longer be processed by an exiting worker.

This also would deadlock,
if the worker was to synchornously wait for the receiver to die.

Do not implicitly stop the worker.
The worker will only be stopped from configuration context, from
conn_reconfig_done(), drbd_adm_down() or drbd_adm_delete_connection(),
after making sure the receiver is already stopped.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:53:00 +01:00
Lars Ellenberg
f3dfa40a67 drbd: fix race when forcefully disconnecting
If a forced disconnect hits a restarting receiver right after it passed
its final "if (C_DISCONNECTING)" test in drbdd_init(), but before it was
actually restarted by drbd_thread_setup, we could be left with a
connection stuck in C_DISCONNECTING, never reaching C_STANDALONE,
which would be necessary to take it down or reconfigure it.

Move the last cleanup into w_after_conn_state_ch(), and do an additional
state change request in conn_try_disconnect(), just in case.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:53:00 +01:00
Andreas Gruenbacher
88104ca458 drbd: Allow to change data-integrity-alg on the fly
The main purpose of this is to allow to turn data integrity checking on
and off on demand without causing interruptions.

Implemented by allocating tconn->peer_integrity_tfm only when receiving
a P_PROTOCOL message.  l accesses to tconn->peer_integrity_tf happen in
worker context, and no further synchronization is necessary.

On the sender side, tconn->integrity_tfm is modified under
tconn->data.mutex, and a P_PROTOCOL message is sent whenever.  All
accesses to tconn->integrity_tfm already happen under this mutex.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:52:59 +01:00
Andreas Gruenbacher
4b6ad6d457 drbd: Remove obsolete drbd_crypto_is_hash()
We allocate hash transformations with crypto_alloc_hash() which will
only return hash algorithms.  It is not necessary to reconfirm that we
actually got a hash algorithm.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:52:58 +01:00
Andreas Gruenbacher
5b614abe30 drbd: Rename integrity_r_tfm -> peer_integrity_tfm
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:52:58 +01:00
Andreas Gruenbacher
8d412fc6d5 drbd: Rename integrity_w_tfm -> integrity_tfm
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:52:58 +01:00
Lars Ellenberg
b57a1e27ee drbd: rename variable sc to res_opts
sc was short for syncer conf, which does not exist anymore anyways.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:52:56 +01:00
Lars Ellenberg
5ecc72c3b9 drbd: rename variable ndc to new_disk_conf
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:52:54 +01:00
Lars Ellenberg
5979e36155 drbd: on reconfiguration requests, mind the SET_DEFAULTS flag
The DRBD_GENL_F_SET_DEFAULTS flag was ignored
for drbd_adm_disk_opts() and drbd_adm_net_opts().

Factor out drbd_set_*_defaults() helper functions,
and call them appropriately.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:50:38 +01:00
Philipp Reisner
0fd0ea064c drbd: Consider all crypto options in connect and in net-options
So for this was simply not considered after the options have been
re-arranged.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:49:08 +01:00
Lars Ellenberg
d9cc6e2318 drbd: fix various disconnecting races
If an admin requests disconnect at a time when the state handling
already disconnects/reconnects, there have been some races.

Make sure to always really stop the network threads before
returning success for disconnect. Do not pretend successfull
forced disconnect, if the state handling returned an error.

Return success from drbd_adm_down() only after all threads are finished.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:49:08 +01:00
Lars Ellenberg
5ee743e92d drbd: remove useless kobject_uevent from drbd_adm_connect
Calling kobject_uevent, which may sleep, from within rcu_read_lock()
protected regions is not possible.
This particular kobject_uevent also is also wrong. It was supposed to
trigger a udev run, just in case something relevant to udev symlink
magic has changed, when adjusting runtime re-configurable settings while
we still had the "syncer conf".  It was improperly placed in connect
when we dropped the "syncer conf".  The right thing to do is probably to
call "udevadm trigger" directly in those cases where drbdadm thinks
there was a need to trigger extra udev runs.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:49:07 +01:00
Philipp Reisner
a18e9d1eb0 drbd: Removed the OBJECT_DYING and the CONFIG_PENDING bits
superseded by refcounting

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:49:07 +01:00
Philipp Reisner
0ace9dfabe drbd: Take a reference on tconn when finding a tconn by name
Rule #3 of kref.txt

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:49:06 +01:00
Philipp Reisner
9dc9fbb357 drbd: Basic refcounting for drbd_tconn
References hold by:
 * Each (running) drbd thread has a reference on tconn
 * Each mdev has a referenc on tconn
 * Beeing in the all_tconn list counts for one reference
 * Each after_conn_state_chg_work has a reference to tconn

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:49:06 +01:00
Lars Ellenberg
71932efc1c drbd: allow status dump request all volumes of a specific resource
We had drbd_adm_get_status (one single volume),
and drbd_adm_get_status_all (dump of all volumes of all resources).

This enhances the latter to be able to dump all volumes
of just one specific resource.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:49:04 +01:00
Philipp Reisner
91fd4dad64 drbd: Proper locking for updates to net_conf under RCU
Removing the get_net_conf()/put_net_conf() functions

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:49:03 +01:00
Philipp Reisner
44ed167da7 drbd: rcu_read_lock() and rcu_dereference() for tconn->net_conf
Removing the get_net_conf()/put_net_conf() calls

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:48:59 +01:00
Philipp Reisner
b032b6fa35 drbd: Allow online change of replication protocol only with agreed_pv >= 100
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:18 +01:00
Philipp Reisner
cd64397c0b drbd: Check consistency of net options when the get changed online
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:18 +01:00
Philipp Reisner
d3fcb4908d drbd: protect all idr accesses that might sleep with drbd_cfg_rwsem
With this commit the locking for all accesses to IDRs is complete:

 * Non sleeping read accesses are protected by RCU
 * sleeping read accesses are protocted by a read lock on drbd_cfg_rwsem
 * accesses that add anything are protected by a write lock
 * accesses that remove an object are protoected by a write lock
   and a call to synchronize_rcu() after it is removed from the IDR
   and before the object is actually free()ed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:17 +01:00
Philipp Reisner
ef35626284 drbd: Converted drbd_cfg_mutex into drbd_cfg_rwsem
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:17 +01:00
Philipp Reisner
695d08fa94 drbd: rcu_read_[un]lock() for all idr accesses that do not sleep
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:16 +01:00
Philipp Reisner
ff370e5a9e drbd: drbd_delete_device() takes a struct drbd_conf * now
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:15 +01:00
Andreas Gruenbacher
0c8e36d9b8 drbd: Introduce protocol version 100 headers
The 8 byte header finally becomes too small. With the protocol 100 header we
have 16 bit for the volume number, proper 32 bit for the data length, and
32 bit for further extensions in the future.

Previous versions of drbd are using version 80 headers for all packets
short enough for protocol 80.  They support both header versions in
worker context, but only version 80 headers in asynchronous context.
For backwards compatibility, continue to use version 80 headers for
short packets before protocol version 100.

From protocol version 100 on, use the same header version for all
packets.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:10 +01:00
Andreas Gruenbacher
da39fec492 drbd: Remove now-unused int_dig_out buffer
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:09 +01:00
Philipp Reisner
19f83c7661 drbd: Implemented conn_lowest_conn()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:05 +01:00
Philipp Reisner
da9fbc276e drbd: Introduced a new type union drbd_dev_state
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:04 +01:00
Philipp Reisner
2aebfabb17 drbd: Renamed id_susp(union drbd_state s) to drbd_suspended(struct drbd_conf *)
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:03 +01:00
Philipp Reisner
78bae59b1b drbd: Introduced drbd_read_state()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:45:03 +01:00
Philipp Reisner
cb703454a2 drbd: Converted drbd_try_outdate_peer() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:44:53 +01:00
Andreas Gruenbacher
22ab6a30b8 drbd: drbd_bm_read() never returns a positive value through drbd_bitmap_io()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:44:47 +01:00
Philipp Reisner
e90285e0ba drbd: Fixed conn_lowest_minor
It actually returned the lowest volume number. While doing that
renamed a few wrongly named variables.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:44:28 +01:00
Lars Ellenberg
f399002e68 drbd: distribute former syncer_conf settings to disk, connection, and resource level
This commit breaks the API again.

Move per-volume former syncer options into disk_conf.
Move per-connection former syncer options into net_conf.
Renamed the remainign sync_conf to res_opts

Syncer settings have been changeable at runtime, so we need to prepare
for these settings to be runtime-changeable in their new home as well.

Introduce new configuration operations, and share the netlink attribute
between "attach" (create new disk) and "disk-opts" (change options).
Same for "connect" and "net-opts".

Some fields cannot be changed at runtime, however.
Introduce a new flag GENLA_F_INVARIANT to be able to trigger on that in
the generated validation and assignment functions.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-08 16:44:20 +01:00
Philipp Reisner
6b75dced00 drbd: conn_khelper() for user mode callbacks for connections
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-04 00:16:32 +01:00
Lars Ellenberg
40cbf085f5 drbd: fix conn_reconfig_start without conn_reconfig_done in drbd_adm_attach
If drbd_adm_attach failed early, it left the CONFIG_PENDING bit on,
blocking any further conn_reconfig_start on that connection.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-04 00:16:31 +01:00
Lars Ellenberg
85f75dd763 drbd: introduce in-kernel "down" command
This greatly simplifies deconfiguration of whole resources.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-04 00:16:23 +01:00
Lars Ellenberg
527f4b24e5 drbd: bail out if a config requrest is over-determined, and not matching
We have resources resp. connections, volumes, and minor numbers.
A config request may specifies all three of them.
If it turns out that the minor belongs to a different connection, or a
different volume number in the same connection, that configuration
request is invalid.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-04 00:16:21 +01:00
Lars Ellenberg
38f19616d2 drbd: new-connection and new-minor succeed, if the object already exists
Follow O_CREAT semantics when creating connection or minor device/volume
objects.  If we need O_CREAT|O_EXCL semantics some time down the road,
we can add NLM_F_EXCL to the netlink message flags.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-04 00:16:21 +01:00
Lars Ellenberg
cffec5b2fe drbd: Allow a Diskless Secondary volume to be removed
Even if the connection is still established.
We should be able to reduce a volume from a replication group,
without taking the whole group offline.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-04 00:16:20 +01:00
Lars Ellenberg
543cc10b4c drbd: drbd_adm_get_status needs to show some more detail
We want to see existing connection objects, even if they do not
currently have volumes attached.

Change the .dumpit variant of drbd_adm_get_status to iterate not over
minor devices, but over connections + volumes.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-04 00:16:19 +01:00
Lars Ellenberg
8432b31457 drbd: allow holes in minor and volume id allocation
s/idr_get_new/idr_get_new_above/

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-04 00:16:17 +01:00
Lars Ellenberg
3b98c0c209 drbd: switch configuration interface from connector to genetlink
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-11-04 00:16:17 +01:00
Lars Ellenberg
a2a3c74f24 drbd: always write bitmap on detach
If we detach due to local read-error (which sets a bit in the bitmap),
stay Primary, and then re-attach (which re-reads the bitmap from disk),
we potentially lost the "out-of-sync" (or, "bad block") information in
the bitmap.

Always (try to) write out the changed bitmap pages before going diskless.

That way, we don't lose the bit for the bad block,
the next resync will fetch it from the peer, and rewrite
it locally, which may result in block reallocation in some
lower layer (or the hardware), and thereby "heal" the bad blocks.

If the bitmap writeout errors out as well, we will (again: try to)
mark the "we need a full sync" bit in our super block,
if it was a READ error; writes are covered by the activity log already.

If that superblock does not make it to disk either, we are sorry.

Maybe we just lost an entire disk or controller (or iSCSI connection),
and there actually are no bad blocks at all, so we don't need to
re-fetch from the peer, there is no "auto-healing" necessary.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-10-30 08:39:18 +01:00
Lars Ellenberg
06f10adbdb drbd: prepare for more than 32 bit flags
- struct drbd_conf { ... unsigned long flags; ... }
 + struct drbd_conf { ... unsigned long drbd_flags[N]; ... }

And introduce wrapper functions for test/set/clear bit operations
on this member.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-10-30 08:39:18 +01:00
Lars Ellenberg
02b91b5526 drbd: introduce stop-sector to online verify
We now can schedule only a specific range of sectors for online verify,
or interrupt a running verify without interrupting the connection.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-10-30 08:39:01 +01:00
Philipp Reisner
9f2247bb9b drbd: Protect accesses to the uuid set with a spinlock
There is at least the worker context, the receiver context, the context of
receiving netlink packts and processes reading a sysfs attribute that access
the uuids.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-10-30 08:39:01 +01:00
Philipp Reisner
d1aa4d04da drbd: Write all pages of the bitmap after an online resize
We need to write the whole bitmap after we moved the meta data
due to an online resize operation.

With the support for one peta byte devices bitmap IO was optimized
to only write out touched pages. This optimization must be turned
off when writing the bitmap after an online resize.

This issue was introduced with drbd-8.3.10.

The impact of this bug is that after an online resize, the next
resync could become larger than expected.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-08-16 17:17:35 +02:00
Lars Ellenberg
db141b2f42 drbd: fix max_bio_size to be unsigned
We capped our max_bio_size respectively max_hw_sectors with
min_t(int, lower level limit, our limit);
unfortunately, some drivers, e.g. the kvm virtio block driver, initialize their
limits to "-1U", and that is of course a smaller "int" value than our limit.

Impact: we started to request 16 MB resync requests,
which lead to protocol error and a reconnect loop.

Fix all relevant constants and parameters to be unsigned int.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-07-24 15:14:00 +02:00
Lars Ellenberg
7ee1fb93f3 drbd: flush drbd work queue before invalidate/invalidate remote
If you do back to back wait-sync/invalidate on a Primary in a tight loop,
during application IO load, you could trigger a race:
  kernel: block drbd6: FIXME going to queue 'set_n_write from StartingSync'
	but 'write from resync_finished' still pending?

Fix this by changing the order of the drbd_queue_work() and
the wake_up() in dec_ap_pending(), and adding the additional
drbd_flush_workqueue() before requesting the full sync.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-07-24 14:15:58 +02:00
Lars Ellenberg
0029d62434 drbd: do not reset rs_pending_cnt too early
Fix asserts like
  block drbd0: in got_BlockAck:4634: rs_pending_cnt = -35 < 0 !

We reset the resync lru cache and related information (rs_pending_cnt),
once we successfully finished a resync or online verify, or if the
replication connection is lost.

We also need to reset it if a resync or online verify is aborted
because a lower level disk failed.

In that case the replication link is still established,
and we may still have packets queued in the network buffers
which want to touch rs_pending_cnt.

We do not have any synchronization mechanism to know for sure when all
such pending resync related packets have been drained.

To avoid this counter to go negative (and violate the ASSERT that it
will always be >= 0), just do not reset it when we lose a disk.

It is good enough to make sure it is re-initialized before the next
resync can start: reset it when we re-attach a disk.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-07-24 14:09:53 +02:00
Lars Ellenberg
c2ba686f35 drbd: report congestion if we are waiting for some userland callback
If the drbd worker thread is synchronously waiting for some userland
callback, we don't want some casual pageout to block on us.
Have drbd_congested() report congestion in that case.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-07-24 14:07:18 +02:00
Lars Ellenberg
383606e0de drbd: differentiate between normal and forced detach
Aborting local requests (not waiting for completion from the lower level
disk) is dangerous: if the master bio has been completed to upper
layers, data pages may be re-used for other things already.
If local IO is still pending and later completes,
this may cause crashes or corrupt unrelated data.

Only abort local IO if explicitly requested.
Intended use case is a lower level device that turned into a tarpit,
not completing io requests, not even doing error completion.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-07-24 14:06:18 +02:00
Linus Torvalds
a70f35af4e Merge branch 'for-3.5/drivers' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
 "Here are the driver related changes for 3.5.  It contains:

   - The floppy changes from Jiri.  Jiri is now also marked as the
     maintainer of floppy.c, I shall be publically branding his forehead
     with red hot iron at the next opportune moment.

   - A batch of drbd updates and fixes from the linbit crew, as well as
     fixes from others.

   - Two small fixes for xen-blkfront courtesy of Jan."

* 'for-3.5/drivers' of git://git.kernel.dk/linux-block: (70 commits)
  floppy: take over maintainership
  floppy: remove floppy-specific O_EXCL handling
  floppy: convert to delayed work and single-thread wq
  xen-blkfront: module exit handling adjustments
  xen-blkfront: properly name all devices
  drbd: grammar fix in log message
  drbd: check MODULE for THIS_MODULE
  drbd: Restore the request restart logic
  drbd: introduce a bio_set to allocate housekeeping bios from
  drbd: remove unused define
  drbd: bm_page_async_io: properly initialize page->private
  drbd: use the newly introduced page pool for bitmap IO
  drbd: add page pool to be used for meta data IO
  drbd: allow bitmap to change during writeout from resync_finished
  drbd: fix race between drbdadm invalidate/verify and finishing resync
  drbd: fix resend/resubmit of frozen IO
  drbd: Ensure that data_size is not 0 before using data_size-1 as index
  drbd: Delay/reject other state changes while establishing a connection
  drbd: move put_ldev from __req_mod() to the endio callback
  drbd: fix WRITE_ACKED_BY_PEER_AND_SIS to not set RQ_NET_DONE
  ...
2012-05-30 09:05:47 -07:00
Eric W. Biederman
38bf195398 connector/userns: replace netlink uses of cap_raised() with capable()
In 2009 Philip Reiser notied that a few users of netlink connector
interface needed a capability check and added the idiom
cap_raised(nsp->eff_cap, CAP_SYS_ADMIN) to a few of them, on the premise
that netlink was asynchronous.

In 2011 Patrick McHardy noticed we were being silly because netlink is
synchronous and removed eff_cap from the netlink_skb_params and changed
the idiom to cap_raised(current_cap(), CAP_SYS_ADMIN).

Looking at those spots with a fresh eye we should be calling
capable(CAP_SYS_ADMIN).  The only reason I can see for not calling capable
is that it once appeared we were not in the same task as the caller which
would have made calling capable() impossible.

In the initial user_namespace the only difference between between
cap_raised(current_cap(), CAP_SYS_ADMIN) and capable(CAP_SYS_ADMIN) are a
few sanity checks and the fact that capable(CAP_SYS_ADMIN) sets
PF_SUPERPRIV if we use the capability.

Since we are going to be using root privilege setting PF_SUPERPRIV seems
the right thing to do.

The motivation for this that patch is that in a child user namespace
cap_raised(current_cap(),...) tests your capabilities with respect to that
child user namespace not capabilities in the initial user namespace and
thus will allow processes that should be unprivielged to use the kernel
services that are only protected with cap_raised(current_cap(),..).

To fix possible user_namespace issues and to just clean up the code
replace cap_raised(current_cap(), CAP_SYS_ADMIN) with
capable(CAP_SYS_ADMIN).

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-10 23:21:39 -04:00
Lars Ellenberg
a574daf5d7 drbd: fix race between drbdadm invalidate/verify and finishing resync
When a resync or online verify is finished or aborted,
drbd does a bulk write-out of changed bitmap pages.

If *in that very moment* a new verify or resync is triggered,
this can race:
 ASSERT( !test_bit(BITMAP_IO, &mdev->flags) ) in drbd_main.c
 FIXME going to queue 'set_n_write from StartingSync' but 'write from resync_finished' still pending?
and similar.

This can be observed with e.g. tight invalidate loops in test scripts,
and probably has no real-life implication.

Still, that race can be solved by first quiescen the device,
before starting a new resync or verify.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:59 +02:00
Philipp Reisner
197296ffed drbd: Delay/reject other state changes while establishing a connection
Changes to the role and disk state should be delayed or rejected
while we establish a connection.

This is necessary, since the peer will base its resync decision
on the UUIDs and the state we sent in the drbd_connect() function.

The most prominent example for this race is becoming primary after
sending state and UUIDs and before the state changes to C_WF_CONNECTION.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:55 +02:00
Lars Ellenberg
f479ea0661 drbd: send intermediate state change results to the peer
DRBD state changes schedule after_state_ch() actions to a worker thread,
which decides on the old and new states of that change, whether to send
an informational state update packet (P_STATE) to the peer.
If it decides to drbd_send_state(), it would however always send the
_curent_ state, which, if a second state change happens before the
after_state_ch() of the first ran, may "fast-forward" the peer's view
about this node.  In most cases that is harmless, but sometimes this can
confuse DRBD, for example into not actually starting a necessary resync
if you do a very tight detach/attach loop on a Connected Secondary.

Fix this by always sending the "new" state of the respective state
transition which scheduled this after_state_ch() work.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:56 +02:00
Lars Ellenberg
a2e9138197 drbd: fix spurious meta data IO "error"
When detaching, even cleanly detaching due to administrator request,
we always go through D_FAILED before we become D_DISKLESS.

Don't let that state change race with an in-flight meta data IO,
or that one might think it actually experienced an IO error.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:54 +02:00
Andreas Gruenbacher
7b4e4d3126 drbd: drbd_nl_resize(): Fix missing put_ldev() on error path
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:49 +02:00
Philipp Reisner
02ee8f95fa drbd: Force flag for the detach operation
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:10:38 +02:00
Philipp Reisner
6809384c71 drbd: Improve compatibility with drbd's older than 8.3.7
Regression introduced with 8.3.11 commit:
drbd: Take a more conservative approach when deciding max_bio_size

Never ever tell an older drbd, that we support more than 32KiB
in a single data request (packet).
Never believe an older drbd, that is supports more than 32KiB
in a single data request (packet)

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:08:57 +02:00
Lars Ellenberg
7948bcdc38 drbd: spelling fix: too small
It is not "to small", but "too small".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:02:22 +02:00
Oleg Nesterov
70834d3070 usermodehelper: use UMH_WAIT_PROC consistently
A few call_usermodehelper() callers use the hardcoded constant instead of
the proper UMH_WAIT_PROC, fix them.

Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Januszewski <spock@gentoo.org>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:41 -07:00
Cong Wang
cfd8005c99 block: remove the second argument of k[un]map_atomic()
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20 21:48:16 +08:00
Philipp Reisner
774b305518 drbd: Implemented new commands to create/delete connections/minors
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:00 +02:00
Philipp Reisner
80883197da drbd: Converted drbd_nl_(net_conf|disconnect)() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:00 +02:00
Philipp Reisner
1aba4d7fcf drbd: Preparing the connector interface to operator on connections
Up to now it only operated on minor numbers. Now it can work also
on named connections.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:59 +02:00
Philipp Reisner
2f5cdd0b2c drbd: Converted the transfer log from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:58 +02:00
Philipp Reisner
3f9cbe937e drbd: Removed the mdev parameter from the ..to_tags() and ...from_tags() functions
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:56 +02:00
Philipp Reisner
0e29d163f7 drbd: Reworked the unconfiguring and thread stopping code
* Moved CONFIG_PENDING and DEVICE_DYING from mdev to tconn.
* Renamed drbd_reconfig_start() and drbd_reconfig_done() to
  conn_reconfig_start() and conn_reconfig_done().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:55 +02:00
Andreas Gruenbacher
8ccf218e9f drbd: Replace atomic_add_return with atomic_inc_return
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:50 +02:00
Lars Ellenberg
7ad651b522 drbd: new on-disk activity log transaction format
Use a new on-disk transaction format for the activity log, which allows
for multiple changes to the active set per transaction.

Using 4k transaction blocks, we can now get rid of the work-around code
to deal with devices not supporting 512 byte logical block size.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:46 +02:00
Lars Ellenberg
46a15bc3ec lru_cache: allow multiple changes per transaction
Allow multiple changes to the active set of elements in lru_cache.
The only current user of lru_cache, drbd, is driving this generalisation.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:45 +02:00
Lars Ellenberg
61610420f7 drbd: in drbd_suspend_al, set AL_SUSPENDED before unlocking the activity log
As using an empty activity log is the whole point of the excercise,
make sure it is still empty when setting AL_SUSPENDED.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:41 +02:00
Philipp Reisner
df24aa45f4 drbd: Implemented connection wide state changes
That is used for graceful disconnect only

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:32 +02:00
Philipp Reisner
8410da8f0e drbd: Introduced tconn->cstate_mutex
In compatibility mode with old DRBDs, use that as the state_mutex
as well.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:45:01 +02:00
Philipp Reisner
bbeb641c3e drbd: Killed volume0; last step of multi-volume-enablement
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:44:58 +02:00
Philipp Reisner
a21e929827 drbd: Moved the mdev member into drbd_work (from drbd_request and drbd_peer_request)
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:08 +02:00
Philipp Reisner
808222845d drbd: Converted drbd_calc_cpu_mask() and drbd_thread_current_set_cpu() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:04 +02:00
Philipp Reisner
0625ac190d drbd: Converted wake_asender() and request_ping() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:58 +02:00
Philipp Reisner
25703f8320 drbd: Moved DISCARD_CONCURRENT to the per connection (tconn) flags
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:55 +02:00
Andreas Gruenbacher
db830c464b drbd: Local variable renames: e -> peer_req
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:42 +02:00
Andreas Gruenbacher
f6ffca9f42 drbd: Rename struct drbd_epoch_entry to struct drbd_peer_request
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:39 +02:00
Jiri Kosina
e060c38434 Merge branch 'master' into for-next
Fast-forward merge with Linus to be able to merge patches
based on more recent version of the tree.
2011-09-15 15:08:18 +02:00
Joe Perches
1d273b929c drbd: Use angle brackets for system includes
Use the normal include style.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-09-15 14:02:57 +02:00
Philipp Reisner
191d3cc8d9 drbd: Made drbd_flush_workqueue() to take a tconn instead of an mdev
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:30:24 +02:00
Philipp Reisner
a0638456c6 drbd: moved crypto transformations and friends from mdev to tconn
sed -i \
       -e 's/mdev->cram_hmac_tfm/mdev->tconn->cram_hmac_tfm/g' \
       -e 's/mdev->integrity_w_tfm/mdev->tconn->integrity_w_tfm/g' \
       -e 's/mdev->integrity_r_tfm/mdev->tconn->integrity_r_tfm/g' \
       -e 's/mdev->int_dig_out/mdev->tconn->int_dig_out/g' \
       -e 's/mdev->int_dig_in/mdev->tconn->int_dig_in/g' \
       -e 's/mdev->int_dig_vv/mdev->tconn->int_dig_vv/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:30:23 +02:00
Philipp Reisner
87eeee41f8 drbd: moved req_lock and transfer log from mdev to tconn
sed -i \
       -e 's/mdev->req_lock/mdev->tconn->req_lock/g' \
       -e 's/mdev->unused_spare_tle/mdev->tconn->unused_spare_tle/g' \
       -e 's/mdev->newest_tle/mdev->tconn->newest_tle/g' \
       -e 's/mdev->oldest_tle/mdev->tconn->oldest_tle/g' \
       -e 's/mdev->out_of_sequence_requests/mdev->tconn->out_of_sequence_requests/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:30:15 +02:00
Philipp Reisner
31890f4ab2 drbd: moved agreed_pro_version, last_received and ko_count to tconn
sed -i \
       -e 's/mdev->agreed_pro_version/mdev->tconn->agreed_pro_version/g' \
       -e 's/mdev->last_received/mdev->tconn->last_received/g' \
       -e 's/mdev->ko_count/mdev->tconn->ko_count/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:07 +02:00
Philipp Reisner
e6b3ea83bc drbd: moved receiver, worker and asender from mdev to tconn
Patch mostly:
sed -i -e 's/mdev->receiver/mdev->tconn->receiver/g' \
       -e 's/mdev->worker/mdev->tconn->worker/g' \
       -e 's/mdev->asender/mdev->tconn->asender/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:06 +02:00
Philipp Reisner
b2fb6dbe52 drbd: moved net_cont and net_cnt_wait from mdev to tconn
Patch partly generated by:

sed -i -e 's/get_net_conf(mdev)/get_net_conf(mdev->tconn)/g' \
       -e 's/put_net_conf(mdev)/put_net_conf(mdev->tconn)/g' \
       -e 's/get_net_conf(odev)/get_net_conf(odev->tconn)/g' \
       -e 's/put_net_conf(odev)/put_net_conf(odev->tconn)/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:04 +02:00
Philipp Reisner
89e58e755e drbd: moved net_conf from mdev to tconn
Besides moving the struct member, everything else is generated by:

sed -i -e 's/mdev->net_conf/mdev->tconn->net_conf/g' \
       -e 's/odev->net_conf/odev->tconn->net_conf/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:03 +02:00
Andreas Gruenbacher
841ce241fa drbd: Replace the ERR_IF macro with an assert-like macro
Remove the file name and line number from the syslog messages generated:
we have no duplicate function names, and no function contains the same
assertion more than once.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:57 +02:00
Andreas Gruenbacher
8554df1c6d drbd: Convert all constants in enum drbd_req_event to upper case
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:55 +02:00
Andreas Gruenbacher
bb3bfe9614 drbd: Remove the unused hash tables
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:54 +02:00
Andreas Gruenbacher
010f6e678f drbd: Put sector and size in struct drbd_epoch_entry into struct drbd_interval
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:52 +02:00
H Hartley Sweeten
ddad9ef582 drivers/block/drbd/drbd_nl.c: use bitmap_parse instead of __bitmap_parse
The buffer 'sc.cpu_mask' is a kernel buffer.  If bitmap_parse is used
instead of __bitmap_parse the extra parameter that indicates a kernel
buffer is not needed.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-02 12:43:49 +02:00
Philipp Reisner
9b2f61aec7 drbd: fix warning
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2011-05-24 10:38:32 +02:00
Bart Van Assche
24c4830c8e drbd: Fix spelling
Found these with the help of ispell -l.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2011-05-24 10:21:29 +02:00
Lars Ellenberg
9a0d9d0389 drbd: fix schedule in atomic
An administrative detach used to request a state change directly to D_DISKLESS,
first suspending IO to avoid the last put_ldev() occuring from an endio handler,
potentially in irq context.

This is not enough on the receiving side (typically secondary), we may miss
some peer_req on the way to local disk, which then may do the last put_ldev()
from their drbd_peer_request_endio().

This patch makes the detach always go through the intermediate D_FAILED state.
We may consider to rename it D_DETACHING.

Alternative approach would be to create yet an other work item to be scheduled
on the worker, do the destructor work from there, and get the timing right.

manually picked commit 564040f from the drbd 8.4 branch.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24 10:14:32 +02:00
Philipp Reisner
99432fcc52 drbd: Take a more conservative approach when deciding max_bio_size
The old (optimistic) implementation could shrink the bio size
on an primary device.

Shrinking the bio size on a primary device is bad. Since there
we might get BIOs with the old (bigger) size shortly after
we published the new size.

The new implementation is more conservative, and eventually
increases the max_bio_size on a primary device (which is valid).
It does so, when it knows the local limit AND the remote limit.

 We cache the last seen max_bio_size of the peer in the meta
 data, and rely on that, to make the operation of single
 nodes more efficient.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24 10:08:58 +02:00
Philipp Reisner
21423fa791 drbd: Fixed state transitions after async outdate-peer-handler returned
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24 10:08:11 +02:00
Linus Torvalds
8d49a77568 Merge branch 'for-2.6.39/drivers' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.39/drivers' of git://git.kernel.dk/linux-2.6-block: (122 commits)
  cciss: fix lost command issue
  drbd: need include for bitops functions declarations
  Revert "cciss: Add missing allocation in scsi_cmd_stack_setup and  corresponding deallocation"
  cciss: fix missed command status value CMD_UNABORTABLE
  cciss: remove unnecessary casts
  cciss: Mask off error bits of c->busaddr in cmd_special_free when calling pci_free_consistent
  cciss: Inform controller we are using 32-bit tags.
  cciss: hoist tag masking out of loop
  cciss: Add missing allocation in scsi_cmd_stack_setup and  corresponding deallocation
  cciss: export resettable host attribute
  drbd: drop code present under #ifdef which is relevant to 2.6.28 and below
  drbd: Fixed handling of read errors on a 'VerifyS' node
  drbd: Fixed handling of read errors on a 'VerifyT' node
  drbd: Implemented real timeout checking for request processing time
  drbd: Remove unused function atodb_endio()
  drbd: improve log message if received sector offset exceeds local capacity
  drbd: kill dead code
  drbd: don't BUG_ON, if bio_add_page of a single page to an empty bio fails
  drbd: Removed left over, now wrong comments
  drbd: serialize admin requests for new verify run with pending bitmap io
  ...
2011-03-27 20:02:07 -07:00
Lars Ellenberg
873b0d5f98 drbd: serialize admin requests for new verify run with pending bitmap io
This is an addendum to
 drbd: serialize admin requests for new resync with pending bitmap io

It avoids a race that could trigger "FIXME" assert log messages.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:07 +01:00
Lars Ellenberg
20ceb2b22e drbd: describe bitmap locking for bulk operation in finer detail
Now that we do no longer in-place endian-swap the bitmap, we allow
selected bitmap operations (testing bits, sometimes even settting bits)
during some bulk operations.

This caused us to hit a lot of FIXME asserts similar to
	FIXME asender in drbd_bm_count_bits,
	bitmap locked for 'write from resync_finished' by worker
Which now is nonsense: looking at the bitmap is perfectly legal
as long as it is not being resized.

This cosmetic patch defines some flags to describe expectations in finer
detail, so the asserts in e.g. bm_change_bits_to() can be skipped if
appropriate.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:02 +01:00
Lars Ellenberg
62b0da3a24 drbd: log UUIDs whenever they change
All decisions about sync, sync direction, and wether or not to
allow a connect or attach are based on our set of UUIDs to tag a
data generation.

Log changes to the UUIDs whenever they occur,
logging "new current UUID P:Q:R:S" is more useful
than "Creating new current UUID".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:01 +01:00