Commit Graph

1849 Commits

Author SHA1 Message Date
Darik Horn
27dd0f3a07 Add machine readable debian/copyright file.
Update the copyright file for DEP-5 policy conformance:

  http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
2012-02-28 13:30:50 -06:00
Darik Horn
5605cf4f73 PPA 0.6.0.52-0ubuntu1 release. 2012-02-27 19:36:21 -06:00
Darik Horn
8591ebc31d Refresh debian/patches after upstream merge. 2012-02-27 19:35:05 -06:00
Darik Horn
1083ff73df Merge branch 'upstream' 2012-02-27 19:33:06 -06:00
Brian Behlendorf
4b787d75c8 Cleanly support debug packages
Allow a source rpm to be rebuilt with debugging enabled.  This
avoids the need to have to manually modify the spec file.  By
default debugging is still largely disabled.  To enable specific
debugging features use the following options with rpmbuild.

  '--with debug'               - Enables ASSERTs

  # For example:
  $ rpmbuild --rebuild --with debug zfs-modules-0.6.0-rc6.src.rpm

Additionally, ZFS_CONFIG has been added to zfs_config.h for
packages which build against these headers.  This is critical
to ensure both zfs and the dependent package are using the same
prototype and structure definitions.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-02-27 14:08:17 -08:00
Brian Behlendorf
570827e129 Add 'dmu_tx' kstats entry
Keep counters for the various reasons that a thread may end up
in txg_wait_open() waiting on a new txg.  This can be useful
when attempting to determine why a particular workload is
underperforming.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-02-27 08:59:10 -08:00
Brian Behlendorf
13be560d89 Add arc_state_t stats to arcstats
To ensure the arc is behaving properly we need greater visibility
into exactly how it's managing the system's memory.  This patch
takes one step in that direction by adding the current arc_state_t
for the anon, mru, mru_ghost, mfu, and mfu_ghost lists.  The l2
arc_state_t is already well represented in the arcstats.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-02-27 08:58:59 -08:00
Ned Bass
3a4f6caf08 Return success from check_slice() if device doesn't exist
When creating a new pool, make_root_vdev() calls check_in_use() to
ensure that none of the constituent disks are in use.  If the disk
contains a valid vdev label it is read to retrieve the list of its
child vdevs and these are checked recursively.  However, the
partitions stored in the vdev label may no longer exist, for example
if the partition table has since been altered.  In any such case we
would want the pool creation to proceed, so this change removes the
check from check_slice() that returns an error if the device doesn't
exist.  As an added assurance, the Solaris implementation also
returns success on ENOENT.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-02-27 08:52:38 -08:00
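
The behavior change in 3a4f6caf08 can be illustrated with a small Python model. This is only a sketch, not the libzfs code: the `in_use_check` callback and the 0 / -1 return convention are assumptions made for illustration. The point it shows is that a missing device (ENOENT) is treated as success rather than as an error.

```python
import errno
import os


def check_slice(path, in_use_check):
    """Toy model: return 0 if `path` may be used, -1 otherwise.

    `in_use_check` is a hypothetical callback that receives an open file
    descriptor and returns True when the device is already in use.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as err:
        if err.errno == errno.ENOENT:
            # The device named in a stale vdev label no longer exists,
            # so it cannot be in use: report success.
            return 0
        # Any other open failure is still reported as an error.
        return -1
    try:
        return -1 if in_use_check(fd) else 0
    finally:
        os.close(fd)
```

Under this model a stale partition path simply drops out of the in-use check, so pool creation can proceed.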
Darik Horn
84cea15a33 Change POST_INSTALL to POST_BUILD in dkms.conf
Install zfs_config.h and Module.symvers earlier so that they can be
used by consumers before the zfs module is activated by dkms.

See also dajhorn/pkg-spl@a68ebb5f24.
2012-02-25 12:10:04 -06:00
Alex Zhuravlev
a473d90cee Export symbols for zero-copy
Export additional symbols to make use of the DMU's zero-copy
API.  This allows external modules to move data in to and out of
the ARC without incurring the cost of a memory copy.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-02-17 12:43:02 -08:00
Richard Yao
b41c9906dc Support ashift=13 for 8KB SSD block sizes
New SSDs are now available which use an internal 8k block size.
To make sure ZFS can get the maximum performance out of these
devices we're increasing the maximum ashift to 13 (8KB).

This value is still small enough that we can fit 16 uberblocks
in the vdev ring label.  However, I don't want to increase this
any further or it will limit the ability to safely roll back a
pool to recover it.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #565
2012-02-13 12:25:27 -08:00
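
The uberblock count quoted in b41c9906dc can be checked with a little arithmetic. The sketch below assumes a 128 KiB uberblock ring in each vdev label and a minimum uberblock slot of 1 KiB; both constants are assumptions based on the commonly documented on-disk format, not values taken from this commit message.

```python
# Assumed constants (not stated in the commit message).
UBERBLOCK_RING_BYTES = 128 * 1024  # size of the uberblock ring in a label
MIN_UBERBLOCK_SHIFT = 10           # smallest uberblock slot is 1 KiB


def uberblock_slots(ashift):
    """Return how many uberblocks fit in the ring for a given ashift."""
    # Each slot is padded up to the device's block size (1 << ashift),
    # but never smaller than the minimum slot size.
    slot_shift = max(ashift, MIN_UBERBLOCK_SHIFT)
    return UBERBLOCK_RING_BYTES >> slot_shift


for ashift in (9, 12, 13, 14):
    print(ashift, uberblock_slots(ashift))
```

Under these assumptions ashift=13 gives 8 KiB slots and 16 uberblocks, matching the message; each further ashift step would halve the count, which is the rollback concern the message raises.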
Turbo Fredriksson
d2e032ca9c Add 'fsid' mount option to allowed options.
Resolves nfs-utils-1.0.x compatibility issue which requires
that the fsid be set in the export options.

  exportfs: Warning: /tank/dir requires fsid= for NFS export

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #570
2012-02-13 09:43:57 -08:00
Darik Horn
72e2ca117d PPA 0.6.0.51-0ubuntu1 release. 2012-02-12 19:30:21 -06:00
Darik Horn
f6a47a52be Refresh debian/patches after upstream merge. 2012-02-12 19:27:35 -06:00
Darik Horn
c8257dbc27 Merge branch 'upstream' 2012-02-12 19:18:01 -06:00
Brian Behlendorf
b10c77f70a Export symbols for zero-copy
Exported the required symbols to make use of the DMU's zero-copy
API.  This allows external modules to move data in to and out of
the ARC without incurring the cost of a memory copy.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-02-10 11:56:55 -08:00
Brian Behlendorf
a31acb462d Use spl_debug_* helpers
When configuring the spl debug log support use the provided wrapper
functions.  This ensures that if --disable-debug-log was used when
building the spl the functions will have no effect.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-02-09 16:37:48 -08:00
Etienne Dechamps
30930fba21 Add support for DISCARD to ZVOLs.
DISCARD (REQ_DISCARD, BLKDISCARD) is useful for thin provisioning.
It allows ZVOL clients to discard (unmap, trim) block ranges from
a ZVOL, thus optimizing disk space usage by allowing a ZVOL to
shrink instead of just grow.

We can't use zfs_space() or zfs_freesp() here, since these functions
only work on regular files, not volumes. Fortunately we can use the
low-level function dmu_free_long_range() which does exactly what we
want.

Currently the discard operation is not added to the log. That's not
a big deal since losing discard requests cannot result in data
corruption. It would however result in disk space usage higher than
it should be. Thus adding log support to zvol_discard() is probably
a good idea for a future improvement.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-02-09 16:19:38 -08:00
Etienne Dechamps
cb2d19010d Support the fallocate() file operation.
Currently only the (FALLOC_FL_PUNCH_HOLE) flag combination is
supported, since it's the only one that matches the behavior of
zfs_space(). This makes it pretty much useless in its current
form, but it's a start.

To support other flag combinations we would need to modify
zfs_space() to make it more flexible, or emulate the desired
functionality in zpl_fallocate().

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #334
2012-02-09 16:19:32 -08:00
Etienne Dechamps
aec69371a6 Check permissions in zfs_space().
This isn't done on Solaris because on this OS zfs_space() can
only be called with an opened file handle. Since the addition of
zpl_truncate_range() this isn't the case anymore, so we need to
enforce access rights.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #334
2012-02-09 15:20:37 -08:00
Etienne Dechamps
5cb63a57f8 Implement the truncate_range() inode operation.
This operation allows "hole punching" in ZFS files. On Solaris this
is done via the vop_space() system call, which maps to the zfs_space()
function. So we just need to write zpl_truncate_range() as a wrapper
around zfs_space().

Note that this only works for regular files, not ZVOLs.

This is currently an insecure implementation without permission
checking, although this isn't that big of a deal since truncate_range()
isn't even callable from userspace.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #334
2012-02-09 15:20:32 -08:00
Brian Behlendorf
93648f314c Fix zconfig.sh non-optimal alignment
The recent zvol improvements have changed default suggested alignment
for zvols from 512b (default) to 8k (zvol blocksize).  Because of this
the zconfig.sh tests which create partitions are now generating a
warning about non-optimal alignments.

This change updates the needed zconfig.sh tests such that a partition
will be properly aligned.  In the process, it shifts from using the
sfdisk utility to the parted utility to create partitions.  It also
moves the creation of labels, partitions, and filesystems into
generic functions in common.sh.in.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-02-09 13:23:28 -08:00
Darik Horn
67fef9ae39 PPA 0.6.0.50-0ubuntu1 release. 2012-02-08 17:57:38 -06:00
Darik Horn
c52a9cc103 Refresh debian/patches after upstream merge. 2012-02-08 17:55:26 -06:00
Darik Horn
555c3c0d82 Merge branch 'upstream' 2012-02-08 17:52:00 -06:00
Etienne Dechamps
dde9380a1b Use 32 as the default number of zvol threads.
Currently, the `zvol_threads` variable, which controls the number of worker
threads which process items from the ZVOL queues, is set to the number of
available CPUs.

This choice seems to be based on the assumption that ZVOL threads are
CPU-bound. This is not necessarily true, especially for synchronous writes.
Consider the situation described in the comments for `zil_commit()`, which is
called inside `zvol_write()` for synchronous writes:

> itxs are committed in batches. In a heavily stressed zil there will be a
> commit writer thread who is writing out a bunch of itxs to the log for a
> set of committing threads (cthreads) in the same batch as the writer.
> Those cthreads are all waiting on the same cv for that batch.
>
> There will also be a different and growing batch of threads that are
> waiting to commit (qthreads). When the committing batch completes a
> transition occurs such that the cthreads exit and the qthreads become
> cthreads. One of the new cthreads becomes the writer thread for the batch.
> Any new threads arriving become new qthreads.

We can easily deduce that, in the case of ZVOLs, there can be a maximum of
`zvol_threads` cthreads and qthreads. The default value for `zvol_threads` is
typically between 1 and 8, which is way too low in this case. This means
there will be a lot of small commits to the ZIL, which is very inefficient
compared to a few big commits, especially since we have to wait for the data
to be on stable storage. Increasing the number of threads will increase the
amount of data waiting to be committed and thus the size of the individual
commits.

On my system, in the context of VM disk image storage (lots of small
synchronous writes), increasing `zvol_threads` from 8 to 32 results in a 50%
increase in sequential synchronous write performance.

We should choose a more sensible default for `zvol_threads`. Unfortunately
the optimal value is difficult to determine automatically, since it depends
on the synchronous write latency of the underlying storage devices. In any
case, a hardcoded value of 32 would probably be better than the current
situation. Having a lot of ZVOL threads doesn't seem to have any real
downside anyway.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Fixes #392
2012-02-08 13:58:10 -08:00
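
The batching argument in dde9380a1b can be made concrete with a toy model. It assumes, as a simplification that is not stated in the commit, that each ZIL commit carries at most one write per worker thread, so a burst of synchronous writes needs at least ceil(writes / zvol_threads) commits. Real batching depends on timing and storage latency, so treat the result as a lower bound only.

```python
def min_zil_commits(sync_writes, zvol_threads):
    """Lower bound on ZIL commits for a burst of synchronous writes.

    Toy model: a batch can hold at most `zvol_threads` writes, because
    each worker thread has only one write in flight at a time.
    """
    if zvol_threads < 1:
        raise ValueError("zvol_threads must be at least 1")
    # Integer ceiling division avoids floating point rounding.
    return (sync_writes + zvol_threads - 1) // zvol_threads


for threads in (1, 8, 32):
    print(threads, min_zil_commits(256, threads))
```

In this model, raising the thread count from 8 to 32 cuts the minimum number of commits for 256 writes from 32 to 8, which is the "few big commits" effect the message describes.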
Etienne Dechamps
34037afe24 Improve ZVOL queue behavior.
The Linux block device queue subsystem exposes a number of configurable
settings described in Linux block/blk-settings.c. The defaults for these
settings are tuned for hard drives, and are not optimized for ZVOLs. Proper
configuration of these options would allow upper layers (I/O scheduler) to
make better decisions about write merging and ordering.

Detailed rationale:

 - max_hw_sectors is set to unlimited (UINT_MAX). zvol_write() is able to
   handle writes of any size, so there's no reason to impose a limit. Let the
   upper layer decide.

 - max_segments and max_segment_size are set to unlimited. zvol_write() will
   copy the requests' contents into a dbuf anyway, so the number and size of
   the segments are irrelevant. Let the upper layer decide.

 - physical_block_size and io_opt are set to the ZVOL's block size. This
   has the potential to somewhat alleviate issue #361 for ZVOLs, by warning
   the upper layers that writes smaller than the volume's block size will be
   slow.

 - The NONROT flag is set to indicate this isn't a rotational device.
   Although the backing zpool might be composed of rotational devices, the
   resulting ZVOL often doesn't exhibit the same behavior due to the COW
   mechanisms used by ZFS. Setting this flag will prevent upper layers from
   making useless decisions (such as reordering writes) based on incorrect
   assumptions about the behavior of the ZVOL.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-02-07 16:23:06 -08:00
Etienne Dechamps
b18019d2d8 Fix synchronicity for ZVOLs.
zvol_write() assumes that the write request must be written to stable storage
if rq_is_sync() is true. Unfortunately, this assumption is incorrect. Indeed,
"sync" does *not* mean what we think it means in the context of the Linux
block layer. This is well explained in linux/fs.h:

    WRITE:       A normal async write. Device will be plugged.
    WRITE_SYNC:  Synchronous write. Identical to WRITE, but passes down
                 the hint that someone will be waiting on this IO
                 shortly.
    WRITE_FLUSH: Like WRITE_SYNC but with preceding cache flush.
    WRITE_FUA:   Like WRITE_SYNC but data is guaranteed to be on
                 non-volatile media on completion.

In other words, SYNC does not *mean* that the write must be on stable storage
on completion. It just means that someone is waiting on us to complete the
write request. Thus triggering a ZIL commit for each SYNC write request on a
ZVOL is unnecessary and harmful for performance. To make matters worse, ZVOL
users have no way to express that they actually want data to be written to
stable storage, which means the ZIL is broken for ZVOLs.

The request for stable storage is expressed by the FUA flag, so we must
commit the ZIL after the write if the FUA flag is set. In addition, we must
commit the ZIL before the write if the FLUSH flag is set.

Also, we must inform the block layer that we actually support FLUSH and FUA.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-02-07 16:23:06 -08:00
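
The ordering rules in b18019d2d8 can be summarized as a small decision function. This is a Python model of the rules as the message states them, not the zvol_write() implementation; the step names `zil_commit` and `write` are labels chosen for illustration.

```python
def zvol_write_steps(sync=False, flush=False, fua=False):
    """Return the ordered steps for one write request (toy model).

    flush: commit the ZIL before the write.
    fua:   commit the ZIL after the write.
    sync:  only a hint that someone is waiting; it adds no commit.
    """
    steps = []
    if flush:
        steps.append("zil_commit")  # flush earlier data first
    steps.append("write")
    if fua:
        steps.append("zil_commit")  # make this write stable
    # `sync` is deliberately ignored: it does not request stable storage.
    return steps


print(zvol_write_steps(sync=True))
print(zvol_write_steps(flush=True, fua=True))
```

A request flagged only as SYNC therefore produces a plain write, while FLUSH and FUA each add a commit on their own side of it.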
Etienne Dechamps
56c34bac44 Support "sync=always" for ZVOLs.
Currently the "sync=always" property works for regular ZFS datasets, but not
for ZVOLs. This patch remedies that.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Fixes #374.
2012-02-07 16:23:06 -08:00
Darik Horn
e67329d8e0 Let libnvpair be linked independently of libzfs.
Autoconf will fail to detect the ZoL libnvpair on systems that do not
implicitly link library runtime dependencies, which is anything that
has the GCC 4.5 DCO update.

Build libuutil before libnvpair, and put it on the LDADD line of
the libnvpair automake template.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes: #560
2012-02-07 11:37:15 -08:00
Darik Horn
8c0719e1e4 PPA 0.6.0.49-0ubuntu1 release. 2012-02-03 15:47:26 -06:00
Darik Horn
014b314281 Add patch: Improve the --with-spl error
If the SPL module is unavailable, then the operator gets an error
message that does not apply to installations that are managed by
DKMS. Change the error message to fit apt-get systems.

A better solution would be to implement a dependency model in DKMS,
or to bundle SPL into the ZFS package.
2012-02-03 15:47:21 -06:00
Darik Horn
3f19360900 Refresh patches after upstream merge. 2012-02-03 15:37:53 -06:00
Darik Horn
c05be9c252 Merge branch 'upstream' 2012-02-03 14:57:14 -06:00
Darik Horn
dc496da23a Revert "Manually sync scripts/ with the upstream release."
This reverts commit ae1cd09777
to keep pkg-zfs limited to the debian/ overlay.

These symlinks were dereferenced in pkg-zfs so that the zfs-linux
source package could be diffed directly onto the vanilla upstream
tarball, but the orig tarball produced by git-buildpackage from the
upstream/* git tags is now being published to the PPA instead.
2012-02-03 13:42:02 -06:00
Darik Horn
85e250091c Revert lib/libspl/asm-generic/atomic.S deletion.
The atomic.S file was deleted when pkg-zfs was converted to a
git-buildpackage project more than a year ago, and I don't remember
whether the deletion was pertinent or accidental.

Regardless, this is a packaging policy violation because it touches a
file outside of the debian/ overlay, and it is not currently required
for building pkg-zfs.

Restore the atomic.S file by reverting commit
pkg-zfs/dajhorn@7e4739a203
"Initial master branch for git-buildpackage."
and discarding all other merge conflicts.
2012-02-03 12:52:53 -06:00
Brian Behlendorf
47621f3d76 Linux 3.3 compat, sops->show_options()
The second argument of sops->show_options() was changed from a
'struct vfsmount *' to a 'struct dentry *'.  Add an autoconf check
to detect the API change and then conditionally define the expected
interface.  In either case we are only interested in the zfs_sb_t.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #549
2012-02-03 10:02:01 -08:00
Brian Behlendorf
d7e398ce1a Cleanup ZFS debug infrastructure
Historically the internal zfs debug infrastructure has been
scattered throughout the code.  Since we expect to start making
more use of this code this patch performs some cleanup.

* Consolidate the zfs debug infrastructure in the zfs_debug.[ch]
  files.  This includes moving the zfs_flags and zfs_recover
  variables, plus moving the zfs_panic_recover() function.

* Remove the existing unused functionality in zfs_debug.c and
  replace it with code which correctly utilizes the spl logging
  infrastructure.

* Remove the __dprintf() function from zfs_ioctl.c.  This is
  dead code, the dprintf() functionality in the kernel relies
  on the spl log support.

* Remove dprintf() from hdr_recl().  This wasn't particularly
  useful and was missing the required format specifier anyway.

* Subsequent patches should unify the dprintf() and zfs_dbgmsg()
  functions.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-02-02 11:24:30 -08:00
Brian Behlendorf
0c5dde492f Allow multiple values per directory entry
When using zfs to back a Lustre filesystem it's advantageous to
store a fid with the object id in the directory zap.  The only
technical impediment to doing this is that the zpl code expects
a single value in the zap per directory entry.

This change relaxes that requirement such that multiple entries
are allowed provided the first one is the object id.  The zpl
code will just ignore additional entries.  This allows the ZoL
code to mount datasets which are being used as Lustre server
backends.

Once the upstream feature flags support is merged in this change
should be updated to a read-only feature.  Until this occurs
other zfs implementations will not be able to read the zfs
filesystems created by Lustre.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-02-02 11:22:08 -08:00
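
The relaxed rule in 0c5dde492f can be pictured with a short Python model. It is a sketch only: representing a directory entry as a list of integers, and the function name `zpl_dirent_object_id`, are assumptions made for illustration and do not mirror the zap API.

```python
def zpl_dirent_object_id(zap_values):
    """Return the object id from a directory entry's value list.

    Toy model: the first value must be the object id; any additional
    values (for example a Lustre fid) are ignored by the zpl.
    """
    if not zap_values:
        raise ValueError("directory entry has no values")
    return zap_values[0]


# A plain zpl entry and one carrying an extra value resolve the same way.
print(zpl_dirent_object_id([42]))
print(zpl_dirent_object_id([42, 0x200000401]))
```

Because only the first value is consulted, extra values stored by another consumer do not prevent the zpl from resolving the entry.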
Brian Behlendorf
e29be02e46 Export symbol zfs_attr_table
Export the zfs_attr_table symbol so it may be used by non-zpl
consumers which are still interested in writing a zpl compatible
dataset (e.g. Lustre).

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2012-01-27 09:23:36 -08:00
Darik Horn
4836ec7d43 PPA 0.6.0.48-0ubuntu1 release. 2012-01-19 20:25:28 -06:00
Darik Horn
b90307e760 Disable dh_autotools-dev_updateconfig.
Calling `dh $@ --with autotools_dev` causes a FTBFS on Ubuntu 10.04
Lucid Lynx and is currently unnecessary for later releases.
2012-01-19 20:22:16 -06:00
Darik Horn
cdcde1e8b9 Merge branch 'upstream' 2012-01-19 20:21:20 -06:00
Prakash Surya
ff998d804f Ignore dataset if the dds_type is DMU_OST_OTHER
Since the zpios and potentially other ZFS tests use the
DMU_OST_OTHER type to label their datasets, the zpool and
zfs commands should gracefully handle this type when it is
encountered.  This patch modifies the commands' behavior
to ignore any datasets with a dds_type of DMU_OST_OTHER.

Signed-off-by: Prakash Surya <surya1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #536
2012-01-19 09:29:48 -08:00
Darik Horn
7c97751e1c PPA 0.6.0.47-0ubuntu1 release. 2012-01-18 20:41:55 -06:00
Darik Horn
21d4841c7a Delete packaging for recomposed libraries.
The spl, avl, efi, share, and unicode libraries are now part of the
uutil, nvpair, zpool, and zfs libraries.

See zfsonlinux/zfs@750562833f.
2012-01-18 20:37:01 -06:00
Darik Horn
37316e2f83 Remove distdir stubbing for DKMS module sources.
Instead of implementing a new distdir_modules rule, the existing
distdir rule is now patched to create the DKMS module sources in a
way that minimizes the number of Lintian complaints.
2012-01-18 20:26:25 -06:00
Darik Horn
8acb1ede58 Refresh debian/patches after upstream merge. 2012-01-18 20:25:37 -06:00
Darik Horn
3daa19f9b1 Add libtool to Build-Depends.
The libtool package provides macros that are used by autogen.
2012-01-18 18:25:53 -06:00
Darik Horn
c8731b3425 Merge branch 'upstream' 2012-01-18 18:25:31 -06:00