mirror_spl-debian

mirror of https://git.proxmox.com/git/mirror_spl-debian synced 2025-08-18 08:53:24 +00:00

Author	SHA1	Message	Date
Darik Horn	95797947b9	PPA 0.6.0.86-0ubuntu1 release.	2012-11-14 20:11:08 -06:00
Darik Horn	1f34809122	Merge branch 'upstream'	2012-11-14 20:10:12 -06:00
Brian Behlendorf	e71a4534b3	SPL 0.6.0-rc12	2012-11-13 14:28:25 -08:00
Darik Horn	4aa01aee81	PPA 0.6.0.85-0ubuntu1 release.	2012-11-11 22:33:58 -06:00
Darik Horn	da09b3d85f	Merge branch 'upstream'	2012-11-11 22:32:05 -06:00
Brian Behlendorf	366346c565	Merge branch 'kmem-cache-optimization' This branch contains kmem cache optimizations designed to resolve the lockups reported in zfsonlinux/zfs#922. The lockups were largely the result of spin lock contention in the slab under low memory conditions. Fundamentally, these changes are all designed to minimize that contention through a variety of methods. * Improved vmem cached deadlock detection * Track emergency objects in rbtree * Optimize spl_kmem_cache_free() * Never spin in kmem_cache_alloc() Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> zfsonlinux/zfs#922	2012-11-08 11:09:17 -08:00
Brian Behlendorf	dc1b30224f	Never spin in kmem_cache_alloc() If we are reaping from the cache and a concurrent allocation occurs then the caller must block until the reaping is complete. This is signaled by the clearing of the KMC_BIT_REAPING bit. Otherwise the caller will be in a tight loop which takes and releases the skc->skc_cache lock. When there are multiple concurrent callers the system will thrash on the lock and appear to lock up. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-11-06 15:48:39 -08:00
Brian Behlendorf	a1af8fb1ea	Optimize spl_kmem_cache_free() Because only virtual slabs may have emergency objects, and these objects are guaranteed to have physical addresses, it can easily be determined whether the passed object is a virtual slab object or an emergency object. This allows us to completely optimize the emergency object free case out of the common free path. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-11-06 14:54:19 -08:00
Brian Behlendorf	ed3163484d	Track emergency object in rbtree In the initial implementation emergency objects were tracked on a per-cache list. The assumption was that under normal operation we would never allocate more than a handful of these objects. So the cost of walking the list during free was expected to be negligible. However, real-world usage has shown that emergency objects tend to be allocated in batches. A deadlock will be detected and several thousand emergency objects will be allocated before the original blocked slab allocation can complete. Therefore the original list has been replaced by a red-black tree which is sorted by the memory address of each allocated object. This bounds the worst case insertion and removal time to O(log n), which minimizes contention on the associated spin lock. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-11-06 14:54:19 -08:00
Brian Behlendorf	165f13c33a	Improved vmem cached deadlock detection The entire goal of performing the slab allocations asynchronously is to be able to detect when a vmalloc() deadlocks. In this case, and only this case, do we want to start allocating emergency objects. The trick here is to minimize false positives because the overhead of tracking emergency objects is far higher than normal slab objects. With that goal in mind the code was reworked to be less sensitive to slow allocations by increasing the wait time. Once a cache is marked deadlocked all subsequent allocations which cannot be satisfied with existing cache objects will immediately allocate new emergency objects. This behavior persists until the asynchronous allocation completes and clears the deadlocked flag. The result of these tweaks is that far fewer emergency objects get created, which is important because this minimizes the cost of releasing them later in kmem_cache_free(). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-11-06 14:54:15 -08:00
Brian Behlendorf	65c2fc5a2e	Merge branch 'splat' Additional debugging, some cleanup, and an assortment of fixes to the SPLAT tests and infrastructure. Full details in the individual patches. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-11-06 14:49:14 -08:00
Brian Behlendorf	1112486356	splat kmem:slab_overcommit: Disabled Disable this test because it may result in an OOM event on the system which can result in the test infrastructure being killed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-11-06 14:48:57 -08:00
Brian Behlendorf	b8296bf3e6	splat atomic:64-bit: Create thread outside spin lock The Fedora 3.6 debug kernel identified the following issue where we create a thread under a spin lock. This isn't safe because sleeping could result in a deadlock. Therefore the lock is changed to a mutex so it's safe to sleep. BUG: sleeping function called from invalid context at mm/slub.c:930 in_atomic(): 1, irqs_disabled(): 0, pid: 10583, name: splat 1 lock held by splat/10583: Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-11-06 14:48:57 -08:00
Brian Behlendorf	0e149d4204	splat: Fix log buffer locking The Fedora 3.6 debug kernel identified the following issue where we call copy_to_user() under a spin lock. This used to be safe in older kernels but no longer appears to be, so the spin lock was changed to a mutex. None of this code is performance critical so allowing the process to sleep is harmless. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-11-06 14:48:56 -08:00
Brian Behlendorf	df870a697f	splat: Cleanup headers Restructure the SPLAT headers such that each test only includes the minimal set of headers it requires. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-11-06 14:48:56 -08:00
Brian Behlendorf	d2733258d0	Condition variable reference counts Reference count every entry and exit from the condition variable functions: cv_wait(), cv_wait_timeout(), cv_signal(), cv_broadcast(). This allows us to safely block in cv_destroy() until all consumers have been scheduled and are no longer accessing the condition variable memory. In addition poison the magic value at the start of cv_destroy() to ensure there are never any new callers after cv_destroy() is called. The consumer is responsible for ensuring this never occurs. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-11-06 14:48:55 -08:00
Brian Behlendorf	87efc30b27	Merge remote branch 'eris/stats' Bring in support for the new KSTAT_TYPE_TXG type. This allows for additional visibility into the txg handling. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-11-06 14:48:43 -08:00
Brian Behlendorf	dba79fcbf2	Add KSTAT_TYPE_TXG type Add a new kstat type for tracking useful statistics about a TXG. The new KSTAT_TYPE_TXG type can be used to track the following statistics per-txg: txg - Unique txg number; state - State (O)pen/(Q)uiescing/(S)yncing/(C)ommitted; birth - Creation time; nread - Bytes read; nwritten - Bytes written; reads - IOPs read; writes - IOPs write; open_time - Length in nanoseconds the txg was open; quiesce_time - Length in nanoseconds the txg was quiescing; sync_time - Length in nanoseconds the txg was syncing. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-11-02 15:17:40 -07:00
Darik Horn	f7cd5c784a	PPA 0.6.0.84-0ubuntu1 release.	2012-11-02 17:08:20 -05:00
Darik Horn	d57fa05b11	Merge branch 'upstream'	2012-11-02 17:06:39 -05:00
Darik Horn	866fd4013a	PPA 0.6.0.83-0ubuntu1 release.	2012-10-28 21:51:39 -05:00
Darik Horn	4855e3e982	Disable Ubuntu 11.04 Natty Narwhal builds. Distro support for Natty ended October 28th 2012: https://lists.ubuntu.com/archives/ubuntu-announce/2012-October/000165.html	2012-10-28 21:49:28 -05:00
Brian Behlendorf	71c9f0b003	Make kstat.ks_update() callback atomic Move the kstat ks_update() callback under the ks_lock. This enables dynamically sized kstats without modification to the kstat API. * Create a kstat with the KSTAT_FLAG_VIRTUAL flag. * Register a ->ks_update() callback which does: o Free any existing ks_data buffer. o Set ks_data_size to the kstat array size. o Set ks_data to an allocated buffer of size ks_data_size. o Populate the array of buffers with the required data. The buffer allocated in the ks_update() callback is guaranteed to remain allocated and valid while the proc sequence handler iterates over the buffer. The lock will not be dropped until the kstat_seq_stop() function is run, making it safe for concurrent access. To allow the ks_update() callback to perform memory allocations the lock was changed to a mutex. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-23 09:36:19 -07:00
Darik Horn	745dde5454	PPA 0.6.0.82-0ubuntu1 release.	2012-10-21 23:17:49 -05:00
Darik Horn	56495472d9	Merge branch 'upstream'	2012-10-21 23:16:52 -05:00
Brian Behlendorf	1e0c2c2ccf	Linux 3.7 compat, __clear_close_on_exec() removed Commit torvalds/linux@b8318b0 moved the __clear_close_on_exec() function out of include/linux/fdtable.h and into fs/file.c making it unavailable to the SPL. Now as it turns out we only used this function to tear down some test infrastructure for the vn_getf()/vn_releasef() SPLAT regression tests. Rather than implement even more autoconf compatibility code to handle this we just remove the test case. This also allows us to drop three existing autoconf tests. This does mean the SPLAT tests will no longer verify these functions but historically they have never been a problem. And if we feel we absolutely need this test coverage I'm sure a more portable version of the test case could be added. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #183	2012-10-18 13:36:44 -07:00
Yuxuan Shui	bcb15891ab	Linux 3.6 compat, kern_path_locked() added The kern_path_parent() function was removed from Linux 3.6 because it was observed that all the callers just want the parent dentry. The simpler kern_path_locked() function replaces kern_path_parent() and does the lookup while holding the ->i_mutex lock. This is good news for the vn implementation because it removes the need for us to handle the locking. However, it makes it harder to implement a single readable vn_remove()/vn_rename() function which is usually what we prefer. Therefore, we implement a new version of vn_remove()/vn_rename() for Linux 3.6 and newer kernels. This allows us to leave the existing working implementation untouched, and to add a simpler version for newer kernels. Long term I would very much like to see all of the vn code removed since what this code enabled is generally frowned upon in the kernel. But that can't happen until we either abandon the zpool.cache file or implement alternate infrastructure to update it correctly in user space. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #154	2012-10-14 16:26:21 -07:00
Massimo Maggi	dea3505dff	Switch KM_SLEEP to KM_PUSHPAGE In this particular instance the allocation occurred in the context of sys_msync()->...->zpl_putpage() where we must be careful not to initiate additional I/O. Signed-off-by: Massimo Maggi <massimo@mmmm.it> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-10-11 16:22:29 -07:00
Darik Horn	551eb9d6a1	PPA 0.6.0.81-0ubuntu1 release.	2012-10-06 21:49:35 -05:00
Darik Horn	099ef7674c	Merge branch 'upstream'	2012-10-06 21:48:18 -05:00
Etienne Dechamps	bbdc6ae495	Add interface for file hole punching. This adds an interface to "punch holes" (deallocate space) in VFS files. The interface is identical to the Solaris VOP_SPACE interface. This interface is necessary for TRIM support on file vdevs. This is implemented using Linux fallocate(FALLOC_FL_PUNCH_HOLE), which was introduced in 2.6.38. For a brief time before 2.6.38 this was done using the truncate_range inode operation, which was quickly deprecated. This patch only supports FALLOC_FL_PUNCH_HOLE. This adds support for the truncate_range() inode operation to VOP_SPACE() for file hole punching. This API is deprecated and removed in 3.5, so it's only useful for old kernels. On tmpfs, the truncate_range() inode operation translates to shmem_truncate_range(). Unfortunately, this function expects the end offset to be inclusive and aligned to the end of a page. If it is not, the kernel will stop with a BUG_ON(). This patch fixes the issue by adapting to the constraints set forth by shmem_truncate_range(). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #168	2012-10-04 16:22:07 -07:00
Darik Horn	51511e6166	PPA 0.6.0.80-0ubuntu1 release.	2012-09-18 16:58:24 -05:00
Darik Horn	74931cc496	Merge branch 'upstream'	2012-09-18 16:57:28 -05:00
Darik Horn	8381643919	Add quantal to the PPA build list. Begin building packages for the Ubuntu 12.10 Quantal Quetzal beta release.	2012-09-18 16:56:16 -05:00
Brian Behlendorf	a6c6839a88	SPL 0.6.0-rc11	2012-09-18 11:28:57 -07:00
Darik Horn	2378eb26f0	PPA 0.6.0.79-0ubuntu1 release.	2012-09-17 23:36:48 -05:00
Darik Horn	a96559872e	Generate META from debian/changelog. The META file contains a version number that is baked into the kernel module and appears in the dmesg. Most users report bugs using this information. At build time, substitute the downstream version in the debian/changelog file for the upstream version in the META file. This obsoletes the volatile-version patch. Closes: dajhorn/pkg-zfs#45	2012-09-14 17:33:01 -05:00
Darik Horn	df3625fa72	PPA 0.6.0.78-0ubuntu1 release.	2012-09-14 12:41:08 -05:00
Darik Horn	015edbd1c0	Merge branch 'upstream'	2012-09-14 12:39:34 -05:00
Brian Behlendorf	3050c9314f	Switch KM_SLEEP to KM_PUSHPAGE Under certain circumstances the following functions may be called in a context where KM_SLEEP is unsafe and can result in a deadlocked system. To avoid this problem the unconditional KM_SLEEPs are converted to KM_PUSHPAGEs. This will prevent them from attempting to initiate any I/O during direct reclaim. This change was originally part of `cd5ca4b` but was reverted by `330fe01`. It always should have had its own commit for exactly this reason. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-12 12:27:09 -07:00
Brian Behlendorf	9b51f21841	Remove TQ_SLEEP -> KM_SLEEP mapping When the taskq code was originally written it seemed like a good idea to simply map TQ_SLEEP to KM_SLEEP. Unfortunately, this assumed that the TQ_* flags would never conflict with any of the Linux GFP_* flags. When adding the TQ_PUSHPAGE support in commit `cd5ca4b` this invariant was accidentally broken. Therefore to support TQ_PUSHPAGE, which is needed for Linux, and prevent any further confusion I have removed this direct mapping. The TQ_SLEEP, TQ_NOSLEEP, and TQ_PUSHPAGE are no longer defined in terms of their KM_* counterparts. Instead a simple mapping function is introduced to convert TQ_* -> KM_* where needed. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #171	2012-09-12 11:41:42 -07:00
Brian Behlendorf	330fe010e4	Revert "Switch KM_SLEEP to KM_PUSHPAGE" This reverts commit `cd5ca4b2f8` due to conflicts in the higher TQ_ bits which caused incorrect behavior. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-12 10:07:48 -07:00
Darik Horn	c29a90e1fb	PPA 0.6.0.77-0ubuntu1 release.	2012-09-11 16:14:36 -05:00
Darik Horn	c1590a7875	Merge branch 'upstream'	2012-09-11 16:12:20 -05:00
Chris Dunlop	dd87332f47	Remove autotools products spl_config.h.in is a generated file: remove and .gitignore it Signed-off-by: Chris Dunlop <chris@onthe.net.au> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-11 10:12:47 -07:00
Brian Behlendorf	3c60f5054c	Debug cv_destroy() with mutex held There still appears to be a race in the condition variables where ->cv_mutex is set after we are woken from the cv_destroy wait queue. This might be possible when cv_destroy() is called immediately after cv_broadcast(). We had some troubles with this previously but there may still be a small race, see commit `d599e4f`. The following patch closes one small race and improves the ASSERTs such that they log the offending value. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> zfsonlinux/zfs#943	2012-09-10 10:23:26 -07:00
Brian Behlendorf	95331f4437	Set KMC_NOEMERGENCY for zlib workspaces The workspace required by zlib to perform compression is roughly 512KB (order-7). These allocations are so large that we should never attempt to directly kmalloc an emergency object for them. It is far preferable to asynchronously vmalloc an additional slab in case it's needed. Then simply block waiting for an existing object to be released or for the new slab to be allocated. This can be accomplished by disabling emergency slab objects by passing the KMC_NOEMERGENCY flag at slab creation time. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> zfsonlinux/zfs#917	2012-09-07 14:36:26 -07:00
Brian Behlendorf	cb5c2acebb	Add KMC_NOEMERGENCY slab flag Provide a flag to disable the use of emergency objects for a specific kmem cache. There may be instances where under no circumstances should you kmalloc() an emergency object. For example, when your cache contains very large objects (>128k). Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>	2012-09-07 14:27:03 -07:00
Darik Horn	b2763311b2	PPA 0.6.0.76-0ubuntu1 release.	2012-09-06 20:05:38 -05:00
Darik Horn	0139cd7a09	PPA 0.6.0.75-0ubuntu1 release.	2012-09-05 09:38:39 -05:00
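
Commit 9b51f21841 above replaces the direct TQ_SLEEP -> KM_SLEEP definition with a small mapping function. A minimal user-space sketch of that idea follows. The flag values, the mask, and the name `tq_to_km_flags()` are hypothetical placeholders chosen for illustration; they are not taken from the SPL headers.

```c
#include <assert.h>

/* Hypothetical flag values; the real SPL headers define their own. */
#define TQ_SLEEP	0x01000000u
#define TQ_NOSLEEP	0x02000000u
#define TQ_PUSHPAGE	0x04000000u
#define TQ_FLAG_MASK	(TQ_SLEEP | TQ_NOSLEEP | TQ_PUSHPAGE)

#define KM_SLEEP	0x0001u
#define KM_NOSLEEP	0x0002u
#define KM_PUSHPAGE	0x0004u

/*
 * Hypothetical helper: translate a taskq flag into the matching kmem
 * flag. Because the TQ_* values are no longer defined in terms of the
 * KM_* values, the two sets of bits can change independently.
 */
static unsigned int
tq_to_km_flags(unsigned int tq_flags)
{
	/* This sketch only models the three flags named in the commit. */
	assert((tq_flags & ~TQ_FLAG_MASK) == 0);

	if (tq_flags & TQ_NOSLEEP)
		return (KM_NOSLEEP);
	if (tq_flags & TQ_PUSHPAGE)
		return (KM_PUSHPAGE);

	/* Default to a sleeping allocation. */
	return (KM_SLEEP);
}
```

A caller would pass the result to its allocator instead of the raw taskq flag, for example `tq_to_km_flags(TQ_PUSHPAGE)` where it previously passed `TQ_PUSHPAGE` directly.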

1128 Commits
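
Commit 71c9f0b003 above lists the steps a ->ks_update() callback takes to provide a dynamically sized kstat: free the old buffer, set the size, allocate a new buffer, and populate it. The following user-space sketch walks through those steps under stated assumptions. `struct demo_kstat`, its `nentries` field, and `demo_ks_update()` are hypothetical stand-ins, `malloc()`/`free()` replace the kernel allocator, and the locking is omitted (per the commit message, the real callback runs with ks_lock held).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kstat fields named in the commit message. */
struct demo_kstat {
	void	*ks_data;	/* buffer handed to readers */
	size_t	ks_data_size;	/* size of ks_data in bytes */
	size_t	nentries;	/* hypothetical: current number of records */
};

/*
 * Hypothetical ks_update()-style callback.
 * Returns 0 on success and -1 if the allocation fails.
 */
static int
demo_ks_update(struct demo_kstat *ksp)
{
	uint64_t *buf;
	size_t i;

	assert(ksp != NULL);

	/* 1. Free any existing ks_data buffer. */
	free(ksp->ks_data);
	ksp->ks_data = NULL;

	/* 2. Set ks_data_size to the current array size. */
	ksp->ks_data_size = ksp->nentries * sizeof (uint64_t);
	if (ksp->ks_data_size == 0)
		return (0);

	/* 3. Set ks_data to a newly allocated buffer of that size. */
	buf = malloc(ksp->ks_data_size);
	if (buf == NULL) {
		ksp->ks_data_size = 0;
		return (-1);
	}

	/* 4. Populate the buffer (placeholder values for illustration). */
	for (i = 0; i < ksp->nentries; i++)
		buf[i] = (uint64_t)i;

	ksp->ks_data = buf;
	return (0);
}
```

Because every call rebuilds the buffer from scratch, the number of records can differ between reads, which is the point of holding the lock until the reader has finished with the buffer.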