napp-it SE Solaris/Illumos Edition
- free to use without support
- commercial use allowed
- no storage limit
- free download for end users
napp-it cs client-server
napp-it SE and cs
- individual support and consulting
- bugfixes / updates to the newest versions or fix releases
- redistribution/bundling/installation on behalf of customers allowed
Request a quote.
Details: Featuresheet.pdf
Tuning/ best use:
In general:
- Use mainstream hardware like Intel server chipsets and NICs, Supermicro boards or LSI HBAs in IT mode
- Use as much RAM as possible (nearly all free RAM is used for read caching)
to serve most reads from RAM instead of slow disks
- Add an SSD as an additional read cache (L2Arc), but only if you cannot add more RAM
Check ARC usage: home >> System >> Statistics >> ARC
If you want to cache sequential data, set zfs:l2arc_noprefetch=0,
see https://storagetuning.wordpress.com/2011/12/01/zfs-tuning-for-ssds/
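A minimal sketch with a placeholder pool name and disk id:
  zpool add tank cache c2t1d0
To also cache sequential (prefetched) data, add to /etc/system and reboot:
  set zfs:l2arc_noprefetch=0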
- Do not fill up SSDs, as performance degrades. Use reservations, or do a
"secure erase" on SSDs that are not new and then overprovision them
with a Host Protected Area, read http://www.anandtech.com/show/6489/playing-with-op
tools: http://www.thomas-krenn.com/en/wiki/SSD_Over-provisioning_using_hdparm or http://www.hdat2.com/files/cookbook_v11.pdf
Google: Enhancing the Write Performance of SSDs
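As a sketch of the hdparm method from the Thomas-Krenn article (Linux; device name and sector count are placeholders, check the correct values for your own SSD first):
  hdparm -N /dev/sdb                                          # show current/native max sectors and HPA state
  hdparm -Np234441648 --yes-i-know-what-i-am-doing /dev/sdb   # permanently limit visible capacity (HPA), rest stays as spare area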
- Disable sync write on filesystems.
If you need sync write or wish to disable the write back cache (LU) for data-security reasons:
add a dedicated Slog as ZIL device with low latency. Prefer DRAM based devices like a ZeusRAM
or a fast (best SLC) SSD with a supercap; a small partition of a large SSD is sufficient.
Examples: ZeusRAM SAS SSD (DRAM based, fastest of all) or an Intel S3700 100/200GB with
included supercap (about 60% of the performance of a ZeusRAM).
See benchmarks about sync write performance at http://napp-it.org/doc/manuals/benchmarks.pdf
Enterprise class powerloss protection that guarantees committed writes to be on disk is a must for an Slog.
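A sketch with placeholder pool/filesystem names and a placeholder device id:
  zfs set sync=disabled tank/data   # fastest, but pending writes may be lost on power failure
  zfs set sync=standard tank/data   # default: honor sync requests of the application
  zpool add tank log c1t5d0         # add a dedicated Slog device for sync writes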
- Add a ZIL accelerator when secure sync write is required, check ZIL usage: home >> System >> Statistics >> ZIL
read about sync writes and the ZIL: constantin.glez.de/blog/2010/07/solaris-zfs-synchronous-writes-and-zil-explained
read about speed degradation on SSDs: www.ddrdrive.com/zil_accelerator.pdf
read about the basics of a ZIL accelerator: http://www.open-zfs.org/w/images/9/98/DDRdrive_zil_rw_revelation.pdf
some benchmarks about the quality of SSDs as a journal device: http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
Effect of SSD overprovisioning and write saturation on steady load: http://www.tomshardware.com/reviews/sandisk-x210-ssd-review,3648-6.html
- Disable ZFS property atime (log last access time)
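Example (filesystem name is a placeholder):
  zfs set atime=off tank/data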
- Use as many vdevs as possible if you need I/O. I/O performance scales with the number of vdevs,
so use 2/3-way mirrored vdevs if you need best I/O values, e.g. for ESXi datastores, or multiple Raid-Z2/Z3 for a fileserver (similar to Raid-60+)
- Use as many disks as possible if you need sequential performance, which scales with the number of disks,
but avoid too large vdevs (max 10 disks in Z2 or 16 in Z3 due to resilver time). Try to combine this with the number of vdevs.
Good example for a 24 disk case: use 2 x 10 disk Raid-Z2 + hotspare + opt. Zil + opt. Arc
If you need more space: replace one vdev with larger disks (disk by disk, resilver), optionally later replace the other vdev.
Ignore that your pool is unbalanced in the meantime.
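As a sketch of the 24 disk example above (disk ids are placeholders):
  zpool create tank \
    raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 c1t8d0 c1t9d0 \
    raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0 c2t8d0 c2t9d0 \
    spare c3t0d0
  zpool add tank log c3t1d0      # optional Slog/Zil
  zpool add tank cache c3t2d0    # optional L2Arc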
- Prefer ashift=12 vdevs even with older 512B disks, or you will have problems replacing them with newer 4k disks
To do so you can modify sd.conf, or create an ashift=12 vdev with newer disks and then replace those disks with the older 512B ones
(if one disk in a vdev is 4k, the whole vdev is created with ashift=12 as well); try this with a testpool first.
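A hedged example of such an sd.conf entry (the vendor/product string is only a placeholder and must exactly match your disk as shown in the disk details; reload sd.conf with update_drv -vf sd or reboot):
  sd-config-list = "ATA     SAMSUNG SSD 840", "physical-block-size:4096";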
- Prefer enterprise class SSD-only pools for best performance, or add a really fast, write optimized ZIL with a supercap to desktop SSDs.
For professional usage you can combine a pool built from Intel's S3500 with a ZIL on a faster S3700 (100 or 200GB),
read: http://www.anandtech.com/show/6433/intel-ssd-dc-s3700-200gb-review
Prefer SSDs with a large spare area like the S3700, or do not use the whole SSD, i.e. do not fill an SSD pool above say 70%,
read http://www.anandtech.com/show/6489/playing-with-op
- For a fast pool stay below 80% pool fillrate; for high performance pools stay below 50% fillrate (set pool reservations to enforce this)
Throughput is a function of pool fillrate. Fragmentation is the price you have to pay for a copy-on-write filesystem,
which in return gives you an always consistent filesystem with snapshots and online scrubbing
http://blog.delphix.com/uday/2013/02/19/78/
http://www.trivadis.com/uploads/tx_cabagdownloadarea/kopfschmerzen_mit_zfs_abgabe_tvd_whitepaper.pdf
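A sketch that enforces about 20% free space on a 2 TB pool via an empty filesystem with a reservation (names and sizes are placeholders):
  zfs create -o reservation=400G tank/reserved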
- Use a large data/backup/media pool and add a fast SSD-only pool if you need high performance (e.g. as ESXi datastore).
Prefer fast enterprise SSDs like the Intel S3700 for heavy write usage or the S3500 for mostly read usage.
If you use consumer SSDs, use a reservation or extra overprovisioning, e.g. create a HPA (host protected area) of 10-20%
- Avoid dedup in nearly every case (use it only when absolutely needed and on smaller dedicated pools, not on large general use pools)
- If you like compression, use LZ4 if available
Enabling compression can increase or lower performance, depending on the compressor, CPU and data.
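Example (filesystem name is a placeholder; compression only affects newly written data):
  zfs set compression=lz4 tank/data
  zfs get compressratio tank/data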
- Tune ZFS recordsize
https://blogs.oracle.com/roch/entry/tuning_zfs_recordsize
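Example for a small-block workload like a database (names and value are placeholders; recordsize only applies to files written afterwards):
  zfs set recordsize=16K tank/db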
- Use expanders with SAS disks; prefer multiple HBA controllers when using Sata disks (even better/faster with SAS disks)
- You may tweak the max_pending setting in /etc/system. Lower values may improve latency, larger values throughput.
Default is mostly 10 (good for fast Sata/SAS disks); lower it on slow disks, set it higher (30-50) on fast SSDs. http://www.c0t0d0s0.org/archives/7370-A-little-change-of-queues.html
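As a sketch for /etc/system on older Illumos/Solaris releases where this tunable exists (reboot required; newer Open-ZFS releases use different parameters):
  set zfs:zfs_vdev_max_pending=10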
Hardware:
- Prefer hardware similar to ZFS boxes that are sold together with NexentaStor/OI/OmniOS
(in most cases Supermicro and Intel Xeon based)
- Use LSI HBA controllers, or controllers that come with raidless IT mode like the LSI 9207
or can be crossflashed to LSI IT mode (e.g. IBM M1015)
- Use ECC RAM (do not reduce the security level of the filesystem by using unreliable RAM)
- Use Intel NICs (1 GbE, prefer 10 GbE like the Intel X540)
- Onboard Sata/AHCI is hotplug capable, but this is disabled by default:
To enable add the following line to /etc/system (OmniOS)
set sata:sata_auto_online=1
Network:
Napp-In-One (Virtual Storage server)
Comstar and blocksize on filesystems
Disk Driver setting (Multipath etc)
- Check used driver for disks in menu disk-details-prtconfig
- Check disk driver config files (napp-it menu disk-details)
- Force 4k alignment in sd.conf (disk-details)
Timeout settings (time to wait for faulted disks)
Special configurations
Dedup and L2Arc
- There are some rules of thumb that can help.
Realtime dedup needs to process the dedup table on every read or write.
The dedup table is held in RAM, or without enough RAM on disk. In that case
performance can become horrible, e.g. a snap delete that can last days.
In a worst case scenario you may need up to 5 GB RAM for every TB of
deduped data. Beside this RAM for dedup, you want RAM as read cache for
performance. There is also a rule of thumb for this: for good performance use
about 1 GB RAM per TB of data. Some workloads can do with less, others
should have more RAM. For the OS itself, add about 2 GB RAM.
Some examples:
If you have a 5 TB pool, you need 5 x 5 + 5 + 2 = 32 GB RAM.
Not a problem regarding RAM prices.
This is ok if you have a high dedup rate, say 5-10, and very expensive and fast storage.
But with a high performance pool you probably want more read cache than only 5 GB out of 32 GB.
If you have a 20 TB pool, you need 20 x 5 + 20 + 2 = 122 GB, so about 128 GB RAM.
From here on RAM becomes very expensive, and without very high dedup rates (not very probable)
the RAM for dedup costs more than the disks that you save
AND
in most cases you want the RAM for performance, and now you use it for capacity saving.
If you used the whole RAM as read cache instead, this would be a much faster setup.
A fast L2Arc like an NVMe can help a little, but count about 5% of the L2Arc size as additional RAM needed to manage the L2Arc entries.
Another option without extra RAM need is LZ4 compression.
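Before enabling dedup you can estimate the achievable dedup ratio and table size on existing data without switching it on (pool name is a placeholder; zdb -S can run a long time on large pools):
  zdb -S tank
  zpool list -o name,size,allocated,dedupratio tank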
Avoid:
- Realtek NICs in server or client; if you must use them, update to the newest driver releases
- Copy tools on Windows like Teracopy
ZFS Tuning options (may not apply to your OS release):