Recovering a Deleted File in Hadoop
Hadoop provides a recovery mechanism called the trash. It is not enabled by default; you enable it in HDFS by setting the property fs.trash.interval. The property's value is the number of minutes after which a trash checkpoint gets deleted; the default value is zero, which disables the trash feature.
1. Set this property to a nonzero value in $BIGINSIGHTS_HOME/hdm/hadoop-conf-staging/core-site.xml, for example:

<property>
  <name>fs.trash.interval</name>
  <value>1440</value>
  <description>Number of minutes after which the checkpoint gets deleted. If zero, the trash feature is disabled.</description>
</property>
There is also a related property, fs.trash.checkpoint.interval, that specifies how often checkpoints are taken. At every checkpoint interval, Hadoop examines the trash directory and deletes all checkpoints older than the value of fs.trash.interval. For example, if fs.trash.interval is 60 minutes and fs.trash.checkpoint.interval is 30 minutes, then every 30 minutes a check is performed that deletes all files more than 60 minutes old.
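The interaction of the two intervals can be sketched with a small simulation. This is illustrative only (it models the policy, not HDFS itself), and the timestamps are hypothetical:

```python
# Sketch of the trash checkpoint policy (illustrative; not HDFS code).
# A checkpoint is deleted once its age exceeds fs.trash.interval; the sweep
# that performs deletions runs every fs.trash.checkpoint.interval minutes.

TRASH_INTERVAL = 60        # fs.trash.interval, in minutes
CHECKPOINT_INTERVAL = 30   # fs.trash.checkpoint.interval, in minutes

def surviving_checkpoints(checkpoint_times, now):
    """Return checkpoint creation times (in minutes) still present at `now`,
    assuming a sweep ran at every multiple of CHECKPOINT_INTERVAL <= now."""
    last_sweep = (now // CHECKPOINT_INTERVAL) * CHECKPOINT_INTERVAL
    # At each sweep, checkpoints whose age exceeds TRASH_INTERVAL are removed.
    return [t for t in checkpoint_times if last_sweep - t <= TRASH_INTERVAL]

# Checkpoints taken at minutes 0, 30, and 60. At minute 95 the most recent
# sweep was at minute 90, where the minute-0 checkpoint (age 90 > 60) was
# deleted, while the minute-30 and minute-60 checkpoints survived.
print(surviving_checkpoints([0, 30, 60], now=95))  # → [30, 60]
```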
2. Run the syncconf.sh script to push the updated configuration to the cluster nodes.
3. Stop the Hadoop server with the stop.sh script, then restart it with the start.sh script.
4. By default, the trash directory is /user/<username>/.Trash, and this is where you can recover a deleted item.
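As a sketch of step 4, assuming a hypothetical user named alice and a file named data.txt that was removed with hadoop fs -rm, recovery looks like this (paths under .Trash/Current mirror the file's original location):

```shell
# With trash enabled, deleting a file moves it to the trash rather than
# erasing it immediately.
hadoop fs -rm /user/alice/data.txt

# The deleted file now sits under the Current checkpoint of alice's trash,
# at a path that mirrors its original location.
hadoop fs -ls /user/alice/.Trash/Current/user/alice/

# Recover it by moving it back out of the trash before the
# fs.trash.interval window expires.
hadoop fs -mv /user/alice/.Trash/Current/user/alice/data.txt /user/alice/data.txt
```

Recovery must happen before the checkpoint containing the file is deleted, so a larger fs.trash.interval gives users a longer window to undo mistakes.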
Creating a Directory in Hadoop
First, create the directory /user.
hadoop fs -mkdir /user
Next, create a directory with your user name.
hadoop fs -mkdir /user/yourusername
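The two steps above can be combined into one command, assuming your HDFS user has permission to create directories under the root. The -p flag creates any missing parent directories and does not fail if they already exist:

```shell
# -p creates /user first if it is missing, so one command covers both steps.
hadoop fs -mkdir -p /user/yourusername
```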