Announcement

**Ericg** · 16 February 2015, 09:51 PM

Originally posted by ihatemichael

I'm running Arch on an external USB hard drive as well (regular hard drive, not flash), and I'm using EXT4 as my FS.

I have no way to check the integrity of the hard drive as I can't do SMART over USB.

Now, when you say it's an external USB hard drive... is it an external hard drive (read: spinning) that runs over a USB connection? Or is it a 64GB (or more) USB flash drive?

**interested** · 16 February 2015, 09:53 PM

Originally posted by ihatemichael

But I've never experienced corrupted logs with rsyslog/syslog-ng and I've been using Linux for more than 15 years.

It happens all the time; you just don't notice it, just as you don't notice the many syslog messages that get dropped (syslog drops messages in many situations).

Remember that a "corrupted" log may contain nothing worse than a single malformed field value (a digit instead of a letter, a text string shorter than expected, etc.). When systemd detects such a bad field value, it rotates the log. Because of the way the journal reader, journalctl, is designed, a single invalid field value in a single log entry causes no problems; it just reads past the error. It is better to mark a log entry with an invalid timestamp as "corrupted" than to let it mess up log-watching scripts.
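To illustrate the "reads past the error" idea, here is a minimal Python sketch. It is not journald's on-disk format or its actual code: the one-line-per-entry layout, the timestamp format, and the names `parse_entry` and `read_log` are all assumptions made up for this example.

```python
from datetime import datetime

# Hypothetical text format for this sketch: "<ISO timestamp> <message>".
STAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_entry(raw):
    """Return (timestamp, message), or None if a field is malformed."""
    try:
        stamp, message = raw.split(" ", 1)
        return datetime.strptime(stamp, STAMP_FORMAT), message
    except ValueError:
        # Missing field or invalid timestamp: treat the entry as corrupted.
        return None


def read_log(lines):
    """Return (good entries, count of corrupted entries), skipping bad ones."""
    good, bad = [], 0
    for raw in lines:
        entry = parse_entry(raw)
        if entry is None:
            bad += 1
            continue  # read past the error instead of aborting
        good.append(entry)
    return good, bad
```

One entry with a bad timestamp is counted and skipped, and every entry after it is still returned to the caller.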

Originally posted by ihatemichael

On the contrary, with journald I'm experiencing corrupted logs on a daily basis and I have to remove them manually from /var/log/journal just because they annoy me.

Stop doing verify if it bothers you so much. The errors are usually trivial. A corrupted journal isn't necessarily a corrupted file.

Sure, a tool that could inspect log entries with corrupted fields, and then delete the entry afterwards would be nice for some. No one has bothered to do it yet though.

**ihatemichael** · 16 February 2015, 09:54 PM

Originally posted by Ericg

Now, when you say it's an external USB hard drive... is it an external hard drive (read: spinning) that runs over a USB connection? Or is it a 64GB (or more) USB flash drive?

It's a Samsung M2 Portable (external USB hard drive (spinning) that runs over a USB connection).

**interested** · 16 February 2015, 10:07 PM

Very nice release. As expected, most of the new features are related to OS containers.

My personal favourite is:
* machinectl gained support for two new "copy-from" and "copy-to" commands for copying files from a running container to the host or vice versa.

I really like the concept of also being able to maintain the containers from the "outside".

**duby229** · 16 February 2015, 10:14 PM

Originally posted by ihatemichael

It's a Samsung M2 Portable (external USB hard drive (spinning) that runs over a USB connection).

I'm pretty sure badblocks will run on that drive. At minimum, it can tell you whether there really are bad sectors on the drive.

http://en.wikipedia.org/wiki/Badblocks

Just run badblocks /dev/sda. Replace sda with the drive you want to scan for bad sectors.
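For anyone scripting this, a small Python sketch of the command follows. The helper name `badblocks_command` and the placeholder `/dev/sdX` are made up for this example; `-s` (show progress), `-v` (verbose) and `-n` (non-destructive read-write test) are regular badblocks options, and the sketch deliberately offers no way to request the destructive `-w` write test.

```python
import subprocess

# Extra flags per scan mode. The default badblocks scan is read-only;
# "-n" selects the non-destructive read-write test.
MODE_FLAGS = {
    "read-only": [],
    "non-destructive": ["-n"],
}


def badblocks_command(device, mode="read-only"):
    """Build the argv list for a badblocks scan of `device`."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unsupported mode: {mode!r}")
    return ["badblocks", "-s", "-v", *MODE_FLAGS[mode], device]


def scan(device, mode="read-only"):
    """Run the scan (needs root) and return the badblocks exit status."""
    return subprocess.run(badblocks_command(device, mode)).returncode
```

Calling `scan("/dev/sdX")` runs the plain read-only pass; unmount the filesystem first if you pick the non-destructive read-write mode.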

**ihatemichael** · 16 February 2015, 10:18 PM

Originally posted by duby229

I'm pretty sure badblocks will run on that drive. At minimum, it can tell you whether there really are bad sectors on the drive.

http://en.wikipedia.org/wiki/Badblocks

I'll try that, thanks.

**SystemCrasher** · 16 February 2015, 10:19 PM

Originally posted by Ericg

Currently, once a log is detected as corrupted, it is removed from the rotation and marked with a "~" in the /var/log/journal directory. Whenever someone invokes journalctl to display log output, it does display those corrupted logs, and it shows all salvageable data.

The problem with this approach is that "shows all salvageable data" is nice... but what about continuing to work with the log data in the usual way?

I think a hardcore, nondestructive "fsck" could look like this: read the records you can read and re-add them to a NEW log file. Rebuild the indexes, etc. Skip damaged data. And if the user wants it to be non-destructive, retain the corrupted file (e.g. for further analysis and repair) and use the rebuilt log file as usual. There are some caveats, though: I can imagine it being a lengthy operation for large logs, and it implies some extra disk space usage.
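A rough Python sketch of that salvage idea, under loud assumptions: the record layout (4-byte length, 4-byte CRC32, payload) is a toy format invented here and has nothing to do with the real journal file format, and `pack_record` and `salvage` are hypothetical names. The damaged input is only read, so the original stays available for later analysis.

```python
import struct
import zlib

# Toy record header: payload length, then CRC32 of the payload (big-endian).
HEADER = struct.Struct(">II")


def pack_record(payload):
    """Frame one payload as header + payload."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def salvage(damaged):
    """Copy every intact record into a new log.

    Returns (rebuilt_log_bytes, skipped_record_count). `damaged` is never
    modified, so the corrupted file can be retained alongside the rebuilt one.
    """
    rebuilt = bytearray()
    skipped = 0
    offset = 0
    while offset + HEADER.size <= len(damaged):
        length, crc = HEADER.unpack_from(damaged, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(damaged):
            # Truncated record or damaged length field: framing is lost here.
            skipped += 1
            break
        payload = damaged[start:end]
        if zlib.crc32(payload) == crc:
            rebuilt += pack_record(payload)  # re-add to the NEW log
        else:
            skipped += 1  # skip damaged data
        offset = end
    return bytes(rebuilt), skipped
```

The extra disk space mentioned above shows up directly: the rebuilt log is a second copy of every intact record, and the pass has to read the whole damaged file once.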

then you probably have a dying hard drive.

I guess these nuts with USB flash drives just dare to unplug them while writes are in progress. Flash ICs use large blocks, it takes considerable time to erase and rewrite a full block, and the flash drive's controller can carry out background activity (e.g. garbage collection/erasing unused blocks to speed up writes). So blindly unplugging a flash drive without informing it can be a really destructive idea. It can cause damage far beyond what the filesystem expects. Filesystems that use checksums, like btrfs, should at least be able to detect this, though.

Basically, on an "unexpected" loss of power to a flash drive...
- In some cases, if the partition table is laid out carelessly, it can end up in the same erase block as something else. In the unlucky case, you were about to rewrite this "something else" and then lost power. Whooooops! The partition table is gone, because the drive controller was interrupted right in the middle of the write and had no time to write the partition table back. Factory formatting is well aware of this sad fact and is crafted in a special way: the partition table must be in its own, separate erase block, with nothing writable around it.
- Even if it is not the partition table, a similar issue can leave one or a few erase blocks missing. This can violate the filesystem's journalling assumptions in some cases, so the filesystem would think it's okay, yet there could be a few blocks that were erased and do not contain the data the filesystem expects to be there. What a given drive considers a completed write when it reports back to the host is really firmware-specific, and it is a big question whether the data has reached the actual storage area in the IC at that point and whether the drive has updated all its internal structures to reflect it.
- Sometimes power loss can be bad enough to damage structures required for the drive's internal operation. If the drive controller was updating its internal flash translation tables and the like when power was lost, anything can happen. The drive could be completely bricked if the controller can't start up with damaged translation tables. There could be massive data loss, stale or incorrect data, whatever. Sometimes it is something strange, e.g. writes to some areas of the drive suddenly fail when the controller hits a badly damaged translation table.
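The first point in the list comes down to simple arithmetic, sketched here in Python. The 4 MiB erase block size is purely an assumption for illustration (real drives differ and usually don't report it), and the function names are made up for this example.

```python
ERASE_BLOCK = 4 * 1024 * 1024  # assumed erase block size in bytes
SECTOR = 512


def erase_blocks(offset, length, erase_block_size=ERASE_BLOCK):
    """Return the range of erase block indexes touched by a byte region."""
    first = offset // erase_block_size
    last = (offset + length - 1) // erase_block_size
    return range(first, last + 1)


def shares_erase_block(region_a, region_b, erase_block_size=ERASE_BLOCK):
    """True if two (offset, length) regions touch a common erase block."""
    a = erase_blocks(*region_a, erase_block_size)
    b = erase_blocks(*region_b, erase_block_size)
    return a.start < b.stop and b.start < a.stop


# The partition table lives in the first sector of the drive.
partition_table = (0, SECTOR)
# A 1 MiB region starting at sector 63 (the old DOS-style layout).
legacy_region = (63 * SECTOR, 1024 * 1024)
# The same region moved to start exactly on the second erase block.
aligned_region = (ERASE_BLOCK, 1024 * 1024)
```

With these numbers, `shares_erase_block(partition_table, legacy_region)` is true, so rewriting that region means the controller erases and rewrites the block holding the partition table as well; the aligned region does not have that problem.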

So it's not as if anyone can expect much reliability from a flash drive. Proper SSDs at least expose SMART, which gives some idea of how the drive is doing. USB flash drives... they usually have poor-quality firmware and can suddenly die without warning, especially if one dares to unplug them at the "wrong" time.

**BradN** · 16 February 2015, 10:24 PM

Originally posted by ihatemichael

I'm running Arch on an external USB hard drive as well (regular hard drive, not flash), and I'm using EXT4 as my FS.

I have no way to check the integrity of the hard drive as I can't do SMART over USB.

I had some Seagate USB3 4TB "expansion" (name) hard drives, and at first I thought I couldn't run smartctl over them, but I found that using "smartctl -d sat -a /dev/sdx" would make it work. Note that the driver name is "sat", not "sata". It's worth trying some of the other drivers smartctl has as well.
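Trying the other device types can be automated. The Python sketch below is only an illustration: the candidate list is a small selection of smartctl `-d` types for USB bridges, not an exhaustive one, the function names are made up, and treating exit-status bits 0 to 2 (command line did not parse, device open failed, a SMART command failed) as "this type does not work" is this sketch's own choice.

```python
import subprocess

# A few smartctl "-d" device types worth trying on USB enclosures.
CANDIDATE_TYPES = ["sat", "usbjmicron", "usbsunplus", "usbcypress"]

# smartctl's exit status is a bitmask; bits 0-2 are treated here as
# "could not query the drive with this device type".
FAILURE_MASK = 0b111


def smartctl_command(device, device_type):
    """Build the argv list for `smartctl -d TYPE -a DEVICE`."""
    return ["smartctl", "-d", device_type, "-a", device]


def first_working_type(device, run=subprocess.run):
    """Return the first device type smartctl accepts for `device`, or None.

    `run` is injectable so the logic can be exercised without real hardware.
    """
    for device_type in CANDIDATE_TYPES:
        result = run(
            smartctl_command(device, device_type),
            capture_output=True,
            text=True,
        )
        if (result.returncode & FAILURE_MASK) == 0:
            return device_type
    return None
```

Run as root, `first_working_type("/dev/sdX")` (with the placeholder replaced by the real device) returns the first type that produced a usable report, which can then be passed to smartctl by hand.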

**alaviss** · 16 February 2015, 10:26 PM

Originally posted by SystemCrasher

The problem with this approach is that "shows all salvageable data" is nice... but what about continuing to work with the log data in the usual way?

I think a hardcore, nondestructive "fsck" could look like this: read the records you can read and re-add them to a NEW log file. Rebuild the indexes, etc. Skip damaged data. And if the user wants it to be non-destructive, retain the corrupted file (e.g. for further analysis and repair) and use the rebuilt log file as usual. There are some caveats, though: I can imagine it being a lengthy operation for large logs, and it implies some extra disk space usage.

I guess these nuts with USB flash drives just dare to unplug them while writes are in progress. Flash ICs use large blocks, it takes considerable time to erase and rewrite a full block, and the flash drive's controller can carry out background activity (e.g. garbage collection/erasing unused blocks to speed up writes). So blindly unplugging a flash drive without informing it can be a really destructive idea. It can cause damage far beyond what the filesystem expects. Filesystems that use checksums, like btrfs, should at least be able to detect this, though.

Basically, on an "unexpected" loss of power to a flash drive...
- In some cases, if the partition table is laid out carelessly, it can end up in the same erase block as something else. In the unlucky case, you were about to rewrite this "something else" and then lost power. Whooooops! The partition table is gone, because the drive controller was interrupted right in the middle of the write and had no time to write the partition table back. Factory formatting is well aware of this sad fact and is crafted in a special way: the partition table must be in its own, separate erase block, with nothing writable around it.
- Even if it is not the partition table, a similar issue can leave one or a few erase blocks missing. This can violate the filesystem's journalling assumptions in some cases, so the filesystem would think it's okay, yet there could be a few blocks that were erased and do not contain the data the filesystem expects to be there. What a given drive considers a completed write when it reports back to the host is really firmware-specific, and it is a big question whether the data has reached the actual storage area in the IC at that point and whether the drive has updated all its internal structures to reflect it.
- Sometimes power loss can be bad enough to damage structures required for the drive's internal operation. If the drive controller was updating its internal flash translation tables and the like when power was lost, anything can happen. The drive could be completely bricked if the controller can't start up with damaged translation tables. There could be massive data loss, stale or incorrect data, whatever. Sometimes it is something strange, e.g. writes to some areas of the drive suddenly fail when the controller hits a badly damaged translation table.

So it's not as if anyone can expect much reliability from a flash drive. Proper SSDs at least expose SMART, which gives some idea of how the drive is doing. USB flash drives... they usually have poor-quality firmware and can suddenly die without warning, especially if one dares to unplug them at the "wrong" time.

I faced that problem once. Most filesystems get corrupted without any warning; only BTRFS reported the csum errors. Even badblocks couldn't find the problem. After reflashing the USB firmware, everything is fine now.

**ihatemichael** · 16 February 2015, 10:29 PM

Originally posted by BradN

I had some Seagate USB3 4TB "expansion" (name) hard drives, and at first I thought I couldn't run smartctl over them, but I found that using "smartctl -d sat -a /dev/sdx" would make it work. Note that the driver name is "sat", not "sata". It's worth trying some of the other drivers smartctl has as well.

Thanks, here we go:

http://sprunge.us/efMK

Systemd 219 Released With A Huge Amount Of New Features